ABSTRACT

We derive an accurate mass estimator for dispersion-supported stellar systems and
demonstrate its validity by analyzing resolved line-of-sight velocity data for globular
clusters, dwarf galaxies, and elliptical galaxies. Specifically, by manipulating the
spherical Jeans equation we show that the mass enclosed within the 3D deprojected
half-light radius $r_{1/2}$ can be determined with only mild assumptions about the spatial
variation of the stellar velocity dispersion anisotropy, as long as the projected velocity
dispersion profile is fairly flat near the half-light radius, as is typically observed. We
find $M_{1/2} = 3\,G^{-1} \langle\sigma_{\rm los}^2\rangle\, r_{1/2} \simeq 4\,G^{-1} \langle\sigma_{\rm los}^2\rangle\, R_{\rm e}$,
where $\langle\sigma_{\rm los}^2\rangle$ is the luminosity-weighted square of the line-of-sight
velocity dispersion and $R_{\rm e}$ is the 2D projected half-light radius. While deceptively
familiar in form, this formula is not the virial theorem, which cannot be used to determine
accurate masses unless the radial profile of the total mass is known a priori. We utilize
this finding to show that all of the Milky Way dwarf spheroidal galaxies (MW dSphs) are
consistent with having formed within a halo of mass approximately $3 \times 10^{9}\,M_\odot$,
assuming a $\Lambda$CDM cosmology. The faintest MW dSphs seem to have formed in dark matter
halos that are at least as massive as those of the brightest MW dSphs, despite the almost
five orders of magnitude spread in luminosity between them. We expand our analysis to the
full range of observed dispersion-supported stellar systems and examine their dynamical
I-band mass-to-light ratios $\Upsilon^{I}_{1/2}$. The $\Upsilon^{I}_{1/2}$ vs. $M_{1/2}$
relation for dispersion-supported galaxies follows a U-shape, with a broad minimum near
$\Upsilon^{I}_{1/2} \simeq 3$ that spans dwarf elliptical galaxies to normal ellipticals, a
steep rise to $\Upsilon^{I}_{1/2} \simeq 3{,}200$ for ultra-faint dSphs, and a more shallow
rise to $\Upsilon^{I}_{1/2} \simeq 800$ for galaxy cluster spheroids.

Key words: Galactic dynamics, dwarf galaxies, elliptical galaxies, galaxy formation, 
dark matter 



1 INTRODUCTION 

Mass determinations for dispersion-supported galaxies based on only line-of-sight velocity
measurements suffer from a notorious uncertainty associated with not knowing the intrinsic
3D velocity dispersion. The difference between radial and tangential velocity dispersions
is usually quantified by the stellar velocity dispersion anisotropy, $\beta$. Many questions
in galaxy formation are affected by our ignorance of $\beta$, including our ability to
quantify the amount of dark matter in the outer parts of elliptical galaxies
(Romanowsky et al. 2003; Dekel et al. 2005), to measure the mass profile of the Milky Way
from stellar halo kinematics (Battaglia et al. 2005; Dehnen et al. 2006), and to infer
accurate mass distributions in dwarf spheroidal galaxies (dSphs)
(Gilmore et al. 2007; Strigari et al. 2007b).

Here we use the spherical Jeans equation to show that for each dispersion-supported galaxy,
there exists one radius within which the integrated mass as inferred from the line-of-sight
velocity dispersion is largely insensitive to $\beta$, and that this radius is approximately
equal to $r_3$, the location where the log-slope of the 3D tracer density profile^1 is $-3$
(i.e., $d\ln n_\star / d\ln r = -3$). Moreover, the mass within $r_3$ is well characterized
by a simple formula that depends only on quantities that may be inferred from observations:

$$ M(r_3) = 3\,G^{-1} \langle\sigma_{\rm los}^2\rangle\, r_3 , \qquad (1) $$

where $M(r)$ is the mass enclosed within a sphere of radius $r$, $\sigma_{\rm los}$ is the
line-of-sight velocity dispersion, and the brackets indicate a luminosity-weighted average.
For a wide range of stellar light distributions that describe dispersion-supported galaxies,
$r_3$ is close to the 3D deprojected half-light radius $r_{1/2}$, and therefore we may also
write:

$$ M_{1/2} \equiv M(r_{1/2}) \simeq 3\,G^{-1} \langle\sigma_{\rm los}^2\rangle\, r_{1/2} $$
$$ \simeq 4\,G^{-1} \langle\sigma_{\rm los}^2\rangle\, R_{\rm e} \qquad (2) $$
$$ \simeq 930 \left( \frac{\langle\sigma_{\rm los}^2\rangle}{{\rm km}^2\,{\rm s}^{-2}} \right) \left( \frac{R_{\rm e}}{\rm pc} \right) M_\odot . $$

^1 In this paper we will often refer to the stellar number density profile, but this work
is applicable to any tracer system, including planetary nebulae and globular clusters that
trace galaxy potentials, and galaxies that trace galaxy cluster potentials.

In the second line we have used $R_{\rm e} \simeq (3/4)\, r_{1/2}$ for the 2D projected
half-light radius. This approximation is accurate to better than 2% for exponential,
Gaussian, King, Plummer, and Sersic profiles (see Appendix B for useful fitting formulae).
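As a quick numerical illustration of Equation 2, the sketch below evaluates the estimator in the units quoted above. The function names and the adopted value of $G$ in pc (km/s)$^2$ per solar mass are assumptions made for this example and are not part of the original analysis; the inputs are the Leo I dispersion and radii listed in Table 1.

```python
# Illustrative sketch of Equation 2; names and constants are assumptions.
G_PC_KMS2_PER_MSUN = 4.30091e-3  # Newton's G in pc (km/s)^2 / Msun


def half_light_mass_from_re(sigma_los_kms, r_e_pc):
    """M_1/2 ~ 4 <sigma_los^2> R_e / G, with R_e the 2D projected half-light radius."""
    return 4.0 * sigma_los_kms ** 2 * r_e_pc / G_PC_KMS2_PER_MSUN


def half_light_mass_from_rhalf(sigma_los_kms, r_half_pc):
    """M_1/2 ~ 3 <sigma_los^2> r_1/2 / G, with r_1/2 the 3D half-light radius."""
    return 3.0 * sigma_los_kms ** 2 * r_half_pc / G_PC_KMS2_PER_MSUN


# Coefficient for unit inputs: this should reproduce the ~930 of Equation 2.
coefficient = half_light_mass_from_re(1.0, 1.0)

# Leo I values from Table 1: 9.0 km/s, R_e = 295 pc, r_1/2 = 388 pc.
leo_i_from_re = half_light_mass_from_re(9.0, 295.0)
leo_i_from_rhalf = half_light_mass_from_rhalf(9.0, 388.0)

print(f"coefficient ~ {coefficient:.0f}")
print(f"Leo I: {leo_i_from_re:.2e} Msun (2D form), {leo_i_from_rhalf:.2e} Msun (3D form)")
```

Both forms land close to the likelihood-based Leo I mass in Table 1, which is the comparison the estimator is meant to pass; the small difference between them reflects the $R_{\rm e} \simeq (3/4)\, r_{1/2}$ approximation.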

As we show below, Equation 2 can be understood under the assumption that the observed
stellar velocity dispersion profile is relatively flat near $R_{\rm e}$. Clearly, one can
write down self-consistent models that violate this assumption. In these cases, the mass
uncertainty is minimized at a radius other than $r_{1/2}$, and Equation 2 will no longer be
as accurate. However, the velocity dispersions of real galaxies in the Universe (including
those we consider below) do appear to be rather flat near the half-light radius, thus
validating the use of Equation 2.

In the next section we discuss the spherical Jeans equation and our method for determining
generalized, maximum-likelihood mass profile solutions based on line-of-sight velocity
measurements. As a point of comparison we also discuss the virial theorem as a mass
estimator for spherical systems. In §3 we derive Equation 2, show that it works using real
galaxy data, and explain why the $\beta$ uncertainty is minimized at
$r \simeq r_3 \simeq r_{1/2}$ for line-of-sight kinematics. In §4 we present two examples of
how $M_{1/2}$ determinations can be used to inform models of galaxy formation: first, we
show that the $M_{1/2}$ vs. $r_{1/2}$ relationship for Milky Way dSph galaxies provides an
important constraint on the type of dark matter halos they were born within; and second, we
examine the dynamical half-light mass-to-light ratios for the full range of
dispersion-supported stellar systems in the Universe and argue that this relationship can
be used to inform models of feedback. We conclude in §5.

In this paper the symbol R will always refer to a pro- 
jected, two-dimensional (2D) radius and the symbol r will 
refer to a deprojected, three-dimensional (3D) radius. 



2 REVIEW AND METHODOLOGY 

In what follows we review the virial theorem as a mass estimator for spherical systems,
introduce the Jeans equation, and present our numerical methodology for using the Jeans
equation to provide general mass likelihood solutions based on line-of-sight kinematic data.
We will use these generalized mass solutions to evaluate our $M_{1/2}$ estimator in §3.



2.1 The Scalar Virial Theorem 

The scalar virial theorem (SVT) is perhaps the most popular equation used to provide rough
mass constraints for spheroidal galaxies (e.g., Poveda 1958; Tully & Fisher 1977;
Busarello et al. 1997). Consider a spherically symmetric dispersion-supported galaxy with a
total gravitating mass profile $M(r)$, which includes a 3D stellar mass density
$\rho_\star(r) = m_\star(r)\, n_\star(r)$ that truncates at a radius $r_{\rm lim}$.^2 Here
$m_\star(r)$ quantifies the distribution of stellar mass per normalized number, while the
stellar number density $n_\star(r)$ is normalized to integrate to unity over the stellar
volume. If $m_\star(r)$ is assumed to be constant, then the SVT can be expressed as:

$$ 4\pi G \int_0^{r_{\rm lim}} n_\star(r)\, M(r)\, r\, dr = \int_0^{r_{\rm lim}} n_\star(r)\, \sigma_{\rm tot}^2(r)\, d^3r = \langle\sigma_{\rm tot}^2\rangle = 3\,\langle\sigma_{\rm los}^2\rangle . \qquad (3) $$

Note that the luminosity-weighted average of the square of the total velocity dispersion,
$\langle\sigma_{\rm tot}^2\rangle$, is independent of $\beta$. Thus if one knows the number
density (either by recording the position of every single star, or by making an assumption
about how the observed surface brightness relates to the number density), the SVT provides
an observationally applicable constraint on the integrated mass profile within the stellar
extent of the system.

Unfortunately, the constraint associated with the SVT is not particularly powerful, as it
allows a family of acceptable solutions for $M(r)$. This point was emphasized by
Merritt (1987, Appendix A), who considered two extreme possibilities for $M(r)$ (a point
mass and a constant density distribution) to show that the SVT constrains the total mass
$M_{\rm t}$ within the stellar extent $r_{\rm lim}$ to obey

$$ \frac{3\,\langle\sigma_{\rm los}^2\rangle}{\langle r^{-1}\rangle} \le G M_{\rm t} \le \frac{3\, r_{\rm lim}^3\, \langle\sigma_{\rm los}^2\rangle}{\langle r^{2}\rangle} , \qquad (4) $$

where $\langle r^{-1}\rangle$ and $\langle r^{2}\rangle$ are moments of the stellar
distribution. The associated constraint is quite weak. For example, if we assume
$n_\star(r)$ follows a King (1962) profile with $r_{\rm lim}/R_{\rm e} = 5$ (typical for
Local Group dwarf spheroidal galaxies), Equation 4 allows a large uncertainty in the mass
within the stellar extent:
$0.7\,\langle\sigma_{\rm los}^2\rangle \le G M_{\rm t}/r_{\rm lim} \le 20\,\langle\sigma_{\rm los}^2\rangle$.

Another common way to express the SVT is to first define a gravitational radius
$r_{\rm g} \equiv G M_{\rm t}^2 / |W|$ (Binney & Tremaine 2008), where $W$ is the potential
energy, which depends on the unknown mass profile. By absorbing our ignorance of the mass
profile into $r_{\rm g}$, we can write the total mass as

$$ M_{\rm t} = G^{-1} \langle\sigma_{\rm tot}^2\rangle\, r_{\rm g} = 3\,G^{-1} \langle\sigma_{\rm los}^2\rangle\, r_{\rm g} . \qquad (5) $$

In the literature it is common to rewrite Equation 5 as

$$ M_{\rm t} = k\, G^{-1} \langle\sigma_{\rm los}^2\rangle\, R_{\rm e} , \qquad (6) $$

where $k \equiv 3\, r_{\rm g}/R_{\rm e}$ is referred to as the 'virial coefficient'. If one
wishes to re-express this version of the SVT in a form analogous to what we have in
Equation 2, we need to relate $M_{\rm t}$ to the mass enclosed within $r_{1/2}$, which again
requires knowledge of the mass profile $M(r) = f(r)\, M_{\rm t}$:

$$ M_{1/2} = f(r_{1/2})\, M_{\rm t} = f(r_{1/2})\, k\, G^{-1} \langle\sigma_{\rm los}^2\rangle\, R_{\rm e} \equiv c(r_{1/2})\, G^{-1} \langle\sigma_{\rm los}^2\rangle\, R_{\rm e} . \qquad (7) $$

Note that the value of $c(r_{1/2})$ depends on the (unknown) mass profile through both
$f(r_{1/2})$ and $k$. Below, using an alternative analysis, we show that $c(r_{1/2}) = 4$
under circumstances that are fairly general for observed galaxies.

^2 The total mass density need not truncate at this radius.

2.2 The Spherical Jeans Equation 

Given the relative weakness of the SVT as a mass estimator, the spherical Jeans equation
provides an attractive alternative. It relates the total gravitating potential $\Phi(r)$ of
a spherically symmetric, dispersion-supported, collisionless stationary system to its tracer
velocity dispersion and tracer number density, under the assumption of dynamical equilibrium
with no streaming motions:

$$ -n_\star \frac{d\Phi}{dr} = \frac{d(n_\star \sigma_r^2)}{dr} + 2\,\frac{\beta\, n_\star \sigma_r^2}{r} . \qquad (8) $$

Here $\sigma_r(r)$ is the radial velocity dispersion of the stars/tracers and
$\beta(r) \equiv 1 - \sigma_t^2/\sigma_r^2$ is a measure of the velocity anisotropy, where
the tangential velocity dispersion $\sigma_t = \sigma_\theta = \sigma_\phi$. It is
informative to rewrite the implied total mass profile as

$$ M(r) = \frac{r\, \sigma_r^2}{G} \left( \gamma_\star + \gamma_\sigma - 2\beta \right) , \qquad (9) $$

where $\gamma_\star \equiv -d\ln n_\star / d\ln r$ and
$\gamma_\sigma \equiv -d\ln \sigma_r^2 / d\ln r$. Without the benefit of tracer proper
motions (or some assumption about the form of the distribution function), the only term on
the right-hand side of Equation 9 that can be determined by observations is $\gamma_\star$,
which follows from the projected surface brightness profile under some assumption about how
it is related to the projected stellar number density $\Sigma_\star(R)$.^3 Via an Abel
inversion (Equation A2) we map $n_\star$ in a one-to-one manner with the spherically
deprojected observed surface brightness profile (i.e., we assume that the number density
traces the light density). As we discuss below, $\sigma_r(r)$ can be inferred from
$\sigma_{\rm los}(R)$ measurements, but this mapping depends on $\beta(r)$, which is free to
vary.
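To make the role of $\beta$ in Equation 9 concrete, the following minimal sketch evaluates the Jeans mass at a single radius for several constant anisotropies while holding $\sigma_r$ fixed. The function name, the value of $G$, and the input numbers are illustrative assumptions, not values from the paper.

```python
# Sketch of Equation 9: M(r) = r sigma_r^2 (gamma_star + gamma_sigma - 2 beta) / G.
G_PC_KMS2_PER_MSUN = 4.30091e-3  # pc (km/s)^2 / Msun (assumed value)


def jeans_mass(r_pc, sigma_r_kms, gamma_star, gamma_sigma, beta):
    """Enclosed mass in Msun implied by the spherical Jeans equation."""
    bracket = gamma_star + gamma_sigma - 2.0 * beta
    return r_pc * sigma_r_kms ** 2 * bracket / G_PC_KMS2_PER_MSUN


# Hypothetical inputs: r = 300 pc, sigma_r = 8 km/s, gamma_star = 3, flat sigma_r.
masses = {beta: jeans_mass(300.0, 8.0, 3.0, 0.0, beta) for beta in (-1.0, 0.0, 0.5)}
for beta, mass in masses.items():
    print(f"beta = {beta:+.1f}: M = {mass:.2e} Msun")
```

At fixed $\sigma_r$ the inferred mass changes by a factor of a few across this range of $\beta$, which is the degeneracy the remainder of the section sets out to control.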

2.3 Mass Likelihoods from Line-of-Sight Velocity 
Dispersion Data 

Line-of-sight kinematic data provide the projected velocity dispersion profile
$\sigma_{\rm los}(R)$. In order to use the Jeans equation one must relate
$\sigma_{\rm los}$ to $\sigma_r$ (as first shown by Binney & Mamon 1982):

$$ \Sigma_\star \sigma_{\rm los}^2(R) = \int_{R^2}^{\infty} n_\star \sigma_r^2(r) \left[ 1 - \frac{R^2}{r^2}\, \beta(r) \right] \frac{dr^2}{\sqrt{r^2 - R^2}} . \qquad (10) $$

It is clear then that there exists a significant degeneracy associated with using the
observed $\Sigma_\star(R)$ and $\sigma_{\rm los}(R)$ profiles to determine an underlying
mass profile $M(r)$ at any radius, as uncertainties in $\beta$ will affect both the mapping
between $\sigma_r$ and $\sigma_{\rm los}$ in Equation 10 and the relationship between $M(r)$
and $\sigma_r$ in Equation 9.
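The projection integral in Equation 10 can be evaluated numerically with the substitution $u^2 = r^2 - R^2$, which turns $dr^2/\sqrt{r^2 - R^2}$ into $2\,du$ and removes the endpoint singularity. The sketch below does this with a simple midpoint rule and checks it against the standard analytic result for an isotropic Plummer sphere (in units $G = M = a = 1$). The Plummer test case and all function names are assumptions chosen for illustration; the paper does not use this particular code.

```python
import math


def project(n_star, sigma_r2, beta, R, n_steps=20000):
    """Return (Sigma_star, sigma_los^2) at projected radius R from Equation 10.

    Uses u^2 = r^2 - R^2 (so dr^2 / sqrt(r^2 - R^2) = 2 du) and u = tan(theta)
    to map the semi-infinite range onto (0, pi/2), then a midpoint rule.
    """
    h = 0.5 * math.pi / n_steps
    surface_sum = 0.0
    weighted_sum = 0.0
    for i in range(n_steps):
        theta = (i + 0.5) * h
        u = math.tan(theta)
        jacobian = 2.0 / math.cos(theta) ** 2  # 2 du = 2 sec^2(theta) dtheta
        r = math.hypot(R, u)
        n = n_star(r)
        surface_sum += n * jacobian
        weighted_sum += n * sigma_r2(r) * (1.0 - beta(r) * (R / r) ** 2) * jacobian
    return surface_sum * h, weighted_sum / surface_sum


# Isotropic Plummer sphere with G = M = a = 1 (standard textbook model).
def plummer_n(r):
    return 3.0 / (4.0 * math.pi) * (1.0 + r * r) ** -2.5


def plummer_sigma_r2(r):
    return 1.0 / (6.0 * math.sqrt(1.0 + r * r))


def isotropic(r):
    return 0.0


R = 0.5
surface_density, sigma_los2 = project(plummer_n, plummer_sigma_r2, isotropic, R)
analytic_surface = 1.0 / (math.pi * (1.0 + R * R) ** 2)
analytic_sigma_los2 = 3.0 * math.pi / (64.0 * math.sqrt(1.0 + R * R))
print(surface_density / analytic_surface, sigma_los2 / analytic_sigma_los2)
```

Swapping `isotropic` for a non-zero `beta(r)` while keeping `sigma_r2` fixed changes the projected dispersion, which is one side of the degeneracy described above.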

One technique for handling the $\beta$ degeneracy and providing a fair representation of the
allowed mass profile given a set of observables is to consider general parameterizations for
$\beta(r)$ and $M(r)$ and then to undertake a maximum likelihood analysis to constrain all
possible parameter combinations. In what follows, we use such a strategy to derive
meaningful mass likelihoods for a number of dispersion-supported galaxies with line-of-sight
velocity data sets. We will use these general results to test our proposed mass estimator.
Our general technique is described in the supplementary section of Strigari et al. (2008)
and in Martinez et al. (2009). We refer the reader to these references for a more complete
discussion.

Briefly, for our fiducial procedure we model the stellar velocity dispersion anisotropy as a
three-parameter function

$$ \beta(r) = (\beta_\infty - \beta_0)\, \frac{r^2}{r^2 + r_\beta^2} + \beta_0 , \qquad (11) $$

and model the total mass density distribution using the six-parameter function

$$ \rho_{\rm tot}(r) = \frac{\rho_s\, e^{-r/r_{\rm cut}}}{(r/r_s)^{\gamma} \left[ 1 + (r/r_s)^{\alpha} \right]^{(\delta - \gamma)/\alpha}} . \qquad (12) $$

For our marginalization, we adopt uniform priors over the following ranges:
$\log_{10}(0.2\, r_{1/2}) < \log_{10}(r_\beta) < \log_{10}(r_{\rm lim})$;
$-10 < \beta_\infty < 0.91$; $-10 < \beta_0 < 0.91$;
$\log_{10}(0.2\, r_{1/2}) < \log_{10}(r_s) < \log_{10}(2\, r_{\rm high})$;
$0 < \gamma < 2$; $3 < \delta < 5$; and $0.5 < \alpha < 3$, where we remind the reader that
$r_{\rm lim}$ is the truncation radius for the stellar density. The variable $r_{\rm cut}$
allows the dark matter halo profile to truncate at some radius beyond the stellar extent,
and we adopt the uniform prior
$\log_{10}(r_{\rm lim}) < \log_{10}(r_{\rm cut}) < \log_{10}(r_{\rm high})$ in our
marginalization. For distant galaxies we use $r_{\rm high} = 10\, r_{\rm lim}$ and for
satellite galaxies of the Milky Way we set $r_{\rm high}$ equal to the Roche limit for a
$10^{9}\,M_\odot$ point mass. In practice, this allowance for $r_{\rm cut}$ is not important
for our purposes because we focus on integrated masses within the stellar radius.^4
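The two parameterizations above are easy to express directly in code. The sketch below assumes that Equation 11 has the form $(\beta_\infty - \beta_0)\, r^2/(r^2 + r_\beta^2) + \beta_0$ and that the lower prior bound on $\gamma$ is 0; both are assumptions where the extracted text is incomplete, and the function names are hypothetical. Only the shape priors that do not depend on a particular galaxy's radii are checked.

```python
import math


def beta_profile(r, beta_0, beta_inf, r_beta):
    """Anisotropy rising (or falling) from beta_0 at r = 0 to beta_inf at large r."""
    return (beta_inf - beta_0) * r ** 2 / (r ** 2 + r_beta ** 2) + beta_0


def rho_tot(r, rho_s, r_s, r_cut, gamma, delta, alpha):
    """Six-parameter total density of Equation 12 with an exponential cutoff."""
    x = r / r_s
    shape = x ** gamma * (1.0 + x ** alpha) ** ((delta - gamma) / alpha)
    return rho_s * math.exp(-r / r_cut) / shape


def inside_shape_priors(beta_0, beta_inf, gamma, delta, alpha):
    """True if the dimensionless parameters fall inside the quoted uniform ranges."""
    return (-10.0 < beta_0 < 0.91 and -10.0 < beta_inf < 0.91
            and 0.0 < gamma < 2.0 and 3.0 < delta < 5.0 and 0.5 < alpha < 3.0)


# With gamma = 1, delta = 3, alpha = 1 and no cutoff the profile is NFW-like,
# so the density at r = r_s is rho_s / 4.
nfw_like_at_rs = rho_tot(1.0, 1.0, 1.0, math.inf, 1.0, 3.0, 1.0)
print(beta_profile(0.0, -0.5, 0.3, 100.0), nfw_like_at_rs)
```

The upper limit of 0.91 on both anisotropy parameters keeps the model away from purely radial orbits ($\beta = 1$), while the lower limit of $-10$ allows strongly tangential solutions.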

We also investigate the effects of a more radical model for the stellar velocity dispersion
anisotropy that allows $\beta(r)$ to have an extremum within the limiting radius. The
specific form we use in this second model is

$$ \beta(r) = \beta_0 + (\beta_\infty - \beta_0)\, \ldots \qquad (13) $$

which allows for mild and large variations within the stellar extent depending on the value
of $r_\beta$. We use the same priors for this functional form as those for our fiducial
model (Equation 11). A caveat that bears mentioning is that neither of our $\beta(r)$
profiles allows for multiple extrema, but they do allow for large variations in $\beta(r)$
with radius. Our motivation for investigating these large variations is not based on
physical arguments for their existence, but rather to see if the validity of our mass
estimator breaks down.

^3 One can make progress if enough individual spectra are obtained such that the population
has been evenly sampled. However, ensuring that this condition has been met is not trivial.

^4 We have explored other prior distributions and find that the results of our likelihood
analysis for $M_{1/2}$ are insensitive to these choices.

Figure 1. Left: The cumulative mass profile generated by analyzing the Carina dSph using
four different constant velocity dispersion anisotropies. The lines represent the median
cumulative mass value from the likelihood as a function of physical radius. The width of the
mass likelihoods (not shown) does not vary much with radius and is approximately the size of
the width at the pinch in the right panel. Right: The cumulative mass profile of the same
galaxy, where the black line represents the median mass from our full mass likelihood (which
allows for a radially varying anisotropy). The different shades represent the inner two
confidence intervals (68% and 95%). The green dot-dashed line represents the contribution of
mass from the stars, assuming a stellar V-band mass-to-light ratio of $3\,M_\odot/L_\odot$.
In both panels the horizontal axis is the 3D physical radius in kpc.

Below we apply our marginalization procedure to resolved kinematic data for MW dSphs, MW
globular clusters, and elliptical galaxies. Since MW dSphs and globular clusters are close
enough for individual stars to be resolved, we consider the joint probability of obtaining
each observed stellar velocity given its observational error and the predicted line-of-sight
velocity dispersion from Equations 8 and 10. In modeling the line-of-sight velocity
distribution for any system, we must take into account that the observed distribution is a
convolution of the intrinsic velocity distribution, arising from the distribution function,
and the measurement uncertainty from each individual star. If we assume that the
line-of-sight velocity distribution can be well described by a Gaussian, which is
observationally consistent with the best-studied samples (see, e.g., Walker et al. 2007),
then the probability of obtaining a set of line-of-sight velocities $\mathscr{V}$ given a
set of model parameters $\mathscr{M}$ is described by the likelihood



$$ P(\mathscr{V}\,|\,\mathscr{M}) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi(\sigma_{{\rm th},i}^2 + \epsilon_i^2)}} \exp\left[ -\frac{1}{2}\, \frac{(\mathscr{V}_i - \bar{v})^2}{\sigma_{{\rm th},i}^2 + \epsilon_i^2} \right] . \qquad (14) $$

The product is over the set of $N$ stars, where $\bar{v}$ is the average velocity of the
galaxy. As expected, the total error at a projected position is a sum in quadrature of the
theoretical intrinsic dispersion, $\sigma_{{\rm th},i}(\mathscr{M})$, and the measurement
error $\epsilon_i$. We generate the posterior probability distribution for the mass at any
radius by multiplying the likelihood by the prior distribution for each of the nine
$\beta(r)$ and $\rho_{\rm tot}(r)$ parameters as well as the observationally derived
parameters and associated errors that yield $n_\star(r)$ for each galaxy, which include
uncertainties in distance. We then integrate over all model parameters, including $\bar{v}$,
to derive a likelihood for mass. Following Martinez et al. (2009), we use a Markov Chain
Monte Carlo technique in order to perform the required ten to twelve dimensional integral.^5
Before moving on, we note that the Gaussian assumption made here is not entirely general,
and thus is a limiting aspect of our mass modeling. While most dSph velocity distributions
are consistent with Gaussian to within membership errors and errors associated with the
possibility of binary star populations (Minor et al. 2010), a small amount of excess
kurtosis is measured in the outer parts of some systems (Łokas 2009).

For elliptical galaxies that are located too far for individual stellar spectra to be
obtained, we analyze the resolved dispersion profiles with the likelihood

$$ P(\mathscr{D}\,|\,\mathscr{M}) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi \epsilon_i^2}} \exp\left[ -\frac{1}{2}\, \frac{(\mathscr{D}_i - \sigma_{{\rm th},i})^2}{\epsilon_i^2} \right] , \qquad (15) $$

where the product is over the set of $N$ dispersion measurements $\mathscr{D}_i$ and
$\epsilon_i$ is the reported error of each measurement.
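The two likelihoods described above can be sketched as log-likelihood functions, which is how they would normally be fed to a sampler. This is a minimal illustration under the Gaussian assumption stated in the text: the function names are hypothetical, the form used for the dispersion-profile likelihood (a Gaussian in the measured dispersions with the reported errors) is an assumption where the extracted Equation 15 is incomplete, and no MCMC machinery is included.

```python
import math


def log_like_velocities(velocities, errors, sigma_th, v_mean):
    """Log of the star-by-star likelihood: each velocity is Gaussian about v_mean
    with variance sigma_th_i^2 + eps_i^2 (intrinsic plus measurement)."""
    total = 0.0
    for v, eps, sig in zip(velocities, errors, sigma_th):
        variance = sig ** 2 + eps ** 2
        total += -0.5 * math.log(2.0 * math.pi * variance)
        total += -0.5 * (v - v_mean) ** 2 / variance
    return total


def log_like_dispersions(measured, errors, sigma_th):
    """Log-likelihood for binned dispersion measurements with reported errors
    (assumed Gaussian form)."""
    total = 0.0
    for d, eps, sig in zip(measured, errors, sigma_th):
        total += -0.5 * math.log(2.0 * math.pi * eps ** 2)
        total += -0.5 * (d - sig) ** 2 / eps ** 2
    return total


# Hypothetical toy data: two stars at +/-10 km/s about the mean, 2 km/s errors.
toy_v, toy_err = [-10.0, 10.0], [2.0, 2.0]
ll_wide = log_like_velocities(toy_v, toy_err, [10.0, 10.0], 0.0)
ll_narrow = log_like_velocities(toy_v, toy_err, [1.0, 1.0], 0.0)
print(ll_wide > ll_narrow)
```

In a full analysis `sigma_th` would come from projecting a Jeans model for each trial parameter set, and the sum of the log-likelihood and log-prior would be passed to the sampler.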



^5 The volume of parameter space changes depending on the number of free parameters used to
fit the photometry of each system, along with the availability of photometric uncertainties.
For each MW dSph we have taken care to ensure that we used what we consider to be the most
reliable photometry that includes observational errors.
Table 1: Observed and derived properties of spheroidal galaxies considered in this paper.

Galaxy | Distance [kpc] | Luminosity [L_sun,V] | R_0 [arcmin] | r_lim [arcmin] | 2D R_e [pc] | 3D r_1/2 [pc] | sqrt(<sigma_los^2>) [km s^-1] | M_1/2 [M_sun] | Upsilon^V_1/2 [M_sun/L_sun,V]
Carina (723) | 105 ± 2 (a) | 4.3 +1.1/-0.9 x 10^5 (b) | 8.8 ± 1.2 | 28.8 ± 3.6 | 254 ± 28 | 334 ± 37 | 6.4 ± 0.2 | 9.56 +0.95/-0.90 x 10^6 | …
Draco (206) | 76 ± 5 (d) | 2.2 +0.7/-0.6 x 10^5 (b) | 7.63 ± 0.04 | 45.1 ± 0.6 | 220 ± 11 | 291 ± 14 | 10.1 ± 0.5 | 2.11 +0.31/-0.31 x 10^7 | 200 (+…/-…)
Fornax (2409) | 147 ± 3 (a) | 1.7 +0.5/-0.4 x 10^7 (b) | 13.7 ± 1.2 | 71.1 ± 4.0 | 714 ± 40 | 944 ± 53 | 10.7 ± 0.2 | 7.39 +0.41/-… x 10^7 | 8.7 (+…/-…)
Leo I (305) | 254 ± 18 | 5.0 +1.8/-1.3 x 10^6 (b) | 6.21 ± 0.95 | 11.70 ± 0.87 | 295 ± 49 | 388 ± 64 | 9.0 ± 0.4 | 2.21 +0.24/-0.24 x 10^7 | 8.8 +3.4/-2.4
Leo II (168) | 233 ± 15 | 7.8 +2.5/-1.9 x 10^5 | 2.64 ± 0.19 | 9.33 ± 0.47 | 177 ± 13 | 233 ± 17 | 6.6 ± 0.5 | 7.25 +1.19/-1.01 x 10^6 | 19 (+…/-…)
Sculptor (1355) | 86 ± 5 | 2.5 +0.9/-0.7 x 10^6 (b) | 5.8 ± 1.6 (c) | 76.5 ± 5.0 | 282 ± 41 | 375 ± 54 | 9.0 ± 0.2 | 2.25 +0.16/-0.15 x 10^7 | 18 +6/-…
Sextans (423) | 96 ± 3 (k) | 5.9 +2.0/-1.4 x 10^5 (b) | 16.6 ± 1.2 (c) | 160.0 ± 50.0 | 768 ± 47 | 1019 ± 62 | 7.1 ± 0.3 | 3.49 +0.56/-0.48 x 10^7 | 120 (+…/-…)
Ursa Minor (212) | 77 ± 4 | 3.9 +1.7/-1.3 x 10^5 (b) | 17.9 ± 2.1 (m) | 77.9 ± 8.9 | 445 ± 44 | 588 ± 58 | 11.5 ± 0.6 | 5.56 +0.79/-0.72 x 10^7 | 290 (+…/-…)

Bootes I (12) | 66 ± 3 | 2.8 (+…/-…) x 10^4 | … +0.60/-0.54 | ~ 45 | 242 +22/-… | … | 9.0 ± 2.2 | … x 10^7 | 1700 (+…/-…)
Canes Venatici I (214) | 218 ± 10 | 2.3 (+…/-…) x 10^5 | … | ~ 50 | … | 750 (+…/-…) | 7.6 ± 0.5 | 2.77 +0.86/-0.62 x 10^7 | 240 (+…/-…)
Canes Venatici II (25) | 160 ± 5 | 7.9 (+…/-…) x 10^3 | … | ~ 10 | 74 +14/-10 | … | 4.6 ± 1.0 | … x 10^6 | 360 (+…/-…)
Coma Berenices (59) | 44 ± 4 | 3.7 (+…/-…) x 10^3 | … +0.36/-0.36 | ~ 18 | 77 +10/-10 | 100 +13/-… | 4.6 ± 0.8 | 1.97 +0.88/-0.60 x 10^6 | 1100 (+…/-…)
Hercules (30) | 133 ± 6 | 1.1 (+…/-…) x 10^4 | … +0.50/-0.30 | ~ 40 | 229 (+…/-…) | 305 (+…/-…) | 5.1 ± 0.9 | 7.50 (+…/-…) x 10^6 | …
Leo IV (17) | 160 ± 15 | 8.7 (+…/-…) x 10^3 | … | ~ 15 | 116 (+…/-…) | … | 3.3 ± 1.7 | 1.14 +3.50/-0.92 x 10^6 | 260 (+…/-…)
Leo T (18) | 407 ± 38 | 1.4 x 10^5 | 0.68 (+…/-…) | 4.8 ± 1.0 | 115 (+…/-…) | 152 +21/-… | 7.8 ± 1.6 | … +4.84/-2.96 x 10^6 | 110 +…/-40
Segue 1 (24) | 23 ± 2 | 3.4 (+…/-…) x 10^2 | … | ~ 20 | 29 +5/-… | 38 (+…/-…) | 4.3 ± 1.1 | 6.01 (+…/-…) x 10^5 | 3500 (+…/-…)
Ursa Major I (39) | 97 ± 4 | 1.4 (+…/-…) x 10^4 | … | ~ 50 | 318 (+…/-…) | … | 7.6 ± 1.0 | 1.26 +…/-0.43 x 10^7 | 1800 (+…/-…)
Ursa Major II (20) | 32 ± 4 | 4.0 (+…/-…) x 10^3 | … | ~ 50 | 140 (+…/-…) | 184 (+…/-…) | 6.7 ± 1.4 | 7.91 (+…/-…) x 10^6 | 4000 (+…/-…)
Willman 1 (40) | 38 ± 7 | 1.0 (+…/-…) x 10^3 | … +0.12/-0.24 | ~ 9 | … | … | 4.0 ± 0.9 | 3.86 (+…/-…) x 10^5 | 770 +930/-440

NGC 185 (n=1.2) | 616 ± 26 | 1.1 x 10^8 | 1.49 | ~ 14.9 | 266 | 355 | 31 ± 1 | … x 10^8 | 5.3 (+…/-…)
NGC 855 (n=1.9) | 9320 | 1.1 x 10^9 | 0.23 | ~ 2.30 | 624 | 837 | 58 ± 3 | … +0.54/-… x 10^9 | 4.5 (+…/-…)

NGC 499 (n=3.6) | 62300 | 4.1 x 10^10 | 0.25 | ~ 2.50 | 4500 | 6070 | 274 ± 7 | 3.27 (+…/-…) x 10^11 | 16 (+…/-…)
NGC 731 (n=3.8) | 52700 | 3.9 x 10^10 | 0.24 | ~ 2.40 | 3600 | 4850 | 163 ± 1 | 8.52 (+…/-…) x 10^10 | 4.4 (+…/-…)
NGC 3853 (n=4.0) | 44600 | 2.1 x 10^10 | 0.24 | ~ 2.40 | 3050 | 4110 | 198 ± 3 | 8.54 (+…/-…) x 10^10 | 8.1 (+…/-…)
NGC 4478 (n=2.07) | 16980 | 7.0 x 10^9 | 0.22 | 1.73 | 1110 | 1490 | 147 ± 1 | 1.96 (+…/-…) x 10^10 | 5.6 (+…/-…)



Note: Galaxies are grouped from top to bottom as pre-SDSS/classical MW dSphs, post-SDSS MW
dSphs, dwarf elliptical galaxies (dEs), and elliptical galaxies (Es). Within the parentheses
next to each MW dSph is the number of stars analyzed. The dSphs with errors on $r_{\rm lim}$
are fit with King profiles (where $R_0$ is the King core radius). Those without sources for
$r_{\rm lim}$ are estimated from Figure 1 of Martin et al. (2008b) (we found that our
$M_{1/2}$ determinations were largely insensitive to the choice of reasonable $r_{\rm lim}$
values). Except for Leo T, all of the post-SDSS dwarfs are fit with truncated exponential
light distributions (where $R_0$ is the exponential scale length derived from the half-light
radius). The dEs and Es are fit with truncated Sersic profiles, where each limiting radius
is not usually quoted in the literature. Also note that errors on the masses are
approximately normal in $\log_{10}(M_{1/2})$. Lastly, note that the quoted errors in the
luminosities and in the dynamical mass-to-light ratios were derived in this paper and are
also approximately log-normal. For the classical dSphs we took into account the errors in
the apparent magnitudes and the errors in the distance estimates. For the post-SDSS dSphs we
considered the quoted errors in absolute magnitudes.

References: Values in column 5 (2D $R_{\rm e}$) for the classical MW dSphs and Leo T, and
the values in columns 6-9 for all of the MW dSphs, are derived in this paper from the quoted
elliptical fits to the surface brightness profiles from the cited sources (this convention
differs from the geometric means that are sometimes quoted from the equivalent elliptical
fits; see, e.g., Section 3 of Irwin & Hatzidimitriou 1995). Except for Hercules and Leo T,
values in columns 2-5 of the post-SDSS MW dSphs are from Martin et al. (2008b). Lastly, the
values in columns 5-9 for the dEs and Es are derived in this paper. The individual
references are as follows: a) Pietrzyński et al. (2009); b) rederived from apparent
magnitudes listed in Mateo (1998); c) Irwin & Hatzidimitriou (1995); d) Bonanos et al.
(2004); e) Ségall et al. (2007); f) Bellazzini et al. (2004); g) Smolčić et al. (2007);
h) Bellazzini et al. (2005); i) Coleman et al. (2007); j) Pietrzyński et al. (2008);
k) Lee et al. (2003); l) Carrera et al. (2002); m) RGB tracers from Palma et al. (2003);
n) Dall'Ora et al. (2006); o) Martin et al. (2008b); p) Greco et al. (2008);
q) Belokurov et al. (2007); r) Sand et al. (2009); s) de Jong et al. (2008);
t) Okamoto et al. (2008); u) Zucker et al. (2006a); v) Willman et al. (2005a); w) derived
from Prugniel & Heraudeau (1998); x) McConnachie et al. (2005); y) Simien & Prugniel (2002);
z) quoted from the NASA/IPAC Extragalactic Database; aa) Simien & Prugniel (2000);
bb) Simien & Prugniel (1997a); cc) Simien & Prugniel (1997b); dd) Kormendy et al. (2009),
who present similar parameters to those originally derived in Ferrarese et al. (2006).
*) Luminosities derived from applying $B - V$ values calculated in Fukugita et al. (1995).
Lastly, the references for the kinematic data used to derive the velocity dispersions are
listed in the caption of Figure 2.



6 J. Wolf et al. 



3 MINIMIZING THE ANISOTROPY 
DEGENERACY 

3.1 Expectations 

Qualitatively, one might expect that the degeneracy between the integrated mass and the
assumed anisotropy parameter will be minimized at some intermediate radius within the
stellar distribution. Such an expectation follows from considering the relationship between
$\sigma_{\rm los}$ and $\sigma_r$.

At the projected center of a spherical, dispersion-supported galaxy ($R = 0$), line-of-sight
observations project onto the radial component with $\sigma_{\rm los} \simeq \sigma_r$,
while at the edge of the galaxy ($R = r_{\rm lim}$), line-of-sight velocities project onto
the tangential component with $\sigma_{\rm los} \simeq \sigma_t$. For example, consider a
galaxy that is intrinsically isotropic ($\beta = 0$). If this system is analyzed using
line-of-sight velocities under the false assumption that $\sigma_r > \sigma_t$
($\beta > 0$) at all radii, then the total velocity dispersion at $r \simeq 0$ would be
underestimated while the total velocity dispersion at $r \simeq r_{\rm lim}$ would be
overestimated. Conversely, if one were to analyze the same galaxy under the assumption that
$\sigma_r < \sigma_t$ ($\beta < 0$) at all radii, then the total velocity dispersion would
be overestimated near the center and underestimated near the galaxy edge. It is plausible
then that there exists some intermediate radius where attempting to infer the enclosed mass
from only line-of-sight kinematics is minimally affected by the unknown value of $\beta$.

These qualitative expectations are borne out explicitly in Figure 1, where we present
inferred mass profiles for the Carina dSph galaxy for several choices of constant $\beta$.
The right-hand panel shows the same data analyzed using our full likelihood analysis, where
we marginalize over the fairly general $\beta(r)$ profile presented in Equation 11. We use
723 stellar velocities from Walker et al. (2009a) with the constraint that their membership
probabilities (which are based on a combination of stellar velocity and metallicity) are
greater than 0.9, and in projection they lie within 650 pc of the center (which is below the
lower limit of $r_{\rm lim}$ given in Table 1). The average velocity error of this set is
approximately 3 km s$^{-1}$. Each line in the left panel of Figure 1 shows the median
likelihood of the cumulative mass value at each radius for the value of $\beta$ indicated.
The 3D half-light radius and the limiting stellar radius are marked for reference. As
expected, forcing $\beta > 0$ produces a systematically lower (higher) mass at a small
(large) radius compared to $\beta < 0$. This of course demands that every pair of $M(r)$
profiles analyzed with different assumptions about $\beta$ cross at some intermediate
radius.^6 Somewhat remarkable is the fact that every pair intersects at approximately the
same radius. We see that this radius is very close to the deprojected 3D half-light radius
$r_{1/2}$. The right-hand panel in Figure 1 shows the full mass likelihood as a function of
radius (which allows for a radially varying anisotropy), where the shaded bands illustrate
the 68% and 95% likelihood contours, respectively. The likelihood contour also pinches near
$r_{1/2}$, as this mass value is the most constrained by the data.

^6 van der Marel et al. demonstrated a comparable result with more restrictive conditions.

By examining each of the well-sampled dSph kinematic data sets (Muñoz et al. 2005;
Koch et al. 2007; Mateo et al. 2008; Walker et al. 2009a) in more detail, we find that the
error on mass near $r_{1/2}$ is always dominated by measurement errors (including the finite
number of stars) rather than the $\beta$ uncertainty, while the mass errors at both smaller
and larger radii are dominated by the $\beta$ uncertainty (and thus are less affected by
measurement error).^7 We now explain this result by examining the Jeans equation in the
context of observables.



3.2 Why is the mass within half-light radius 

insensitive to velocity dispersion anisotropy? 

Here we present the derivation of Equations 1 and 2. We start by analytically showing that
there exists a radius $r_{\rm eq}$ within which the dynamical mass will be minimally
affected by the velocity dispersion anisotropy, $\beta(r)$. We then consider two cases of
interest for observed dispersion-supported systems. First, we consider the case when the
velocity dispersion anisotropy is spatially constant and show that
$r_{\rm eq} \simeq r_3$, where $r_3$ is an observable defined such that
$\gamma_\star \equiv -d\ln n_\star / d\ln r = 3$ at $r = r_3$. Second, we extend our
analysis to allow for non-constant $\beta(r)$ and show that under mild assumptions about the
variation of $\beta(r)$, the mass within radius $r_3$ is insensitive to the velocity
dispersion anisotropy.

While the steps outlined above provide a deeper insight into Equation 1, the essence of our
arguments can be laid out in a few lines. We begin by rewriting the Jeans equation such that
the $\beta(r)$ dependence is absorbed into the definition of
$\sigma_{\rm tot}^2 = \sigma_r^2 + \sigma_\theta^2 + \sigma_\phi^2 = (3 - 2\beta)\, \sigma_r^2$:

$$ G M(r)\, r^{-1} = \sigma_{\rm tot}^2(r) + \sigma_r^2(r) \left( \gamma_\star + \gamma_\sigma - 3 \right) . \qquad (16) $$

We then note that if $\gamma_\sigma(r_3) \ll 3$ (as our numerical computations show it must
be for flat observed $\sigma_{\rm los}(R)$ profiles), then at $r = r_3$ the mass depends
only on $\sigma_{\rm tot}$ and we may write

$$ M(r_3) \simeq G^{-1} \sigma_{\rm tot}^2(r_3)\, r_3 \simeq G^{-1} \langle\sigma_{\rm tot}^2\rangle\, r_3 = 3\,G^{-1} \langle\sigma_{\rm los}^2\rangle\, r_3 , \qquad (17) $$

where the last expression is Equation 1. We remind the reader that the brackets indicate a
luminosity-weighted average over the entire system. In the above chain of arguments we have
used the relation $\langle\sigma_{\rm tot}^2\rangle \simeq \sigma_{\rm tot}^2(r_3)$. We will
show why this is a good approximation in Section 3.2.2.

Finally, we show in Appendix B that the log-slope of
$n_\star$ is approximately 3 (i.e. $\gamma_\star \simeq 3$) at the deprojected half-light radius,
$r_3 \simeq r_{1/2}$, for most common light profiles, and therefore the
last expression in Equation 17 provides our mass estimator (Equation 2).
For example, $r_3 \simeq 0.94\, r_{1/2}$ for a Plummer profile
and $r_3 \simeq 1.15\, r_{1/2}$ for King (1962) profiles and for the family
of Sersic (1968) profiles with $n = 0.5$ to 10. The relationships
between $r_{1/2}$ and the observable scale radii for various
commonly used surface density profiles are provided in Appendix B.
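A minimal numerical sketch of the estimator and of the Plummer example may help fix the numbers. The function names, the unit system (pc, km/s, solar masses) and the adopted value of $G$ are assumptions made here for illustration; the single number `sigma_los_kms` stands in for the square root of the luminosity-weighted $\langle\sigma_{\rm los}^2\rangle$.

```python
import math

# Newton's constant in pc (km/s)^2 / Msun (assumed unit system for this sketch).
G_PC = 4.30091e-3


def half_light_mass(sigma_los_kms, r_half_pc):
    """M_1/2 = 3 <sigma_los^2> r_1/2 / G, returned in solar masses."""
    return 3.0 * sigma_los_kms**2 * r_half_pc / G_PC


def half_light_mass_projected(sigma_los_kms, R_e_pc):
    """Projected form of the estimator, M_1/2 ~ 4 <sigma_los^2> R_e / G."""
    return 4.0 * sigma_los_kms**2 * R_e_pc / G_PC


def plummer_radii(a=1.0):
    """Return (r_3, r_half, R_e) for a Plummer sphere with scale radius a."""
    # gamma_star = 5 x^2 / (1 + x^2) with x = r/a, so gamma_star = 3 at x^2 = 3/2.
    r3 = a * math.sqrt(1.5)
    # Enclosed light fraction x^3 / (1 + x^2)^(3/2) equals 1/2 at r_half.
    r_half = a / math.sqrt(2.0 ** (2.0 / 3.0) - 1.0)
    # The projected half-light radius of a Plummer sphere equals its scale radius.
    return r3, r_half, a
```

For a Plummer sphere this gives $r_3/r_{1/2}$ of about 0.94 and $r_{1/2}/R_{\rm e}$ of about 1.30, close to the factor 4/3 implied by the two forms of the estimator.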



^ A similar effect was discussed but not fully explored in
Strigari et al. (2007b).

3.2.1 Existence of a radius $r_{\rm eq}$ where the mass profile is minimally affected by anisotropy

Consider a velocity dispersion-supported stellar system that
is well studied, such that $\Sigma_\star(R)$ and $\sigma_{\rm los}(R)$ are determined
accurately by observations. If we model this system's mass
profile using the Jeans equation, any viable solution will
keep the quantity $\Sigma_\star(R)\,\sigma_{\rm los}^2(R)$ fixed to within allowable
errors. With this in mind, we rewrite Equation 10 in a form
that is invertible, isolating the integral's $R$-dependence into
a kernel:

$$\Sigma_\star \sigma_{\rm los}^2(R) = \int_{R^2}^{\infty} \left[ n_\star \sigma_r^2\,(1-\beta) + \int_{r^2}^{\infty} \frac{\beta\, n_\star \sigma_r^2}{2\tilde r^2}\, {\rm d}\tilde r^2 \right] \frac{{\rm d}r^2}{\sqrt{r^2 - R^2}}. \qquad (18)$$

We explain this derivation in Appendix A, where we also
perform an Abel inversion to solve for $\sigma_r(r)$ and $M(r)$ in
terms of directly observable quantities. (While we were writing
this paper we learned that Mamon & Boué 2010 had
independently performed a similar analysis.)

Because Equation 18 is invertible, the fact that the left-hand
side is an observed quantity and independent of $\beta$ implies
that the term in brackets must be well determined regardless
of the chosen $\beta$. This allows us to equate the isotropic
integrand with an arbitrary anisotropic integrand:

$$n_\star \sigma_r^2 \big|_{\beta=0} = n_\star \sigma_r^2 \left[1 - \beta(r)\right] + \int_r^{\infty} \frac{\beta\, n_\star \sigma_r^2}{\tilde r}\, {\rm d}\tilde r. \qquad (19)$$



We now take a derivative with respect to $\ln r$ and subtract
Equation 8 to obtain the following result:

$$M(r;\beta) - M(r;0) = \frac{\beta(r)\, r\, \sigma_r^2(r)}{G}\,(\gamma_\star + \gamma_\sigma + \gamma_\beta - 3). \qquad (20)$$

We remind the reader that $\gamma_\star \equiv -{\rm d}\ln n_\star/{\rm d}\ln r$ and $\gamma_\sigma \equiv
-{\rm d}\ln\sigma_r^2/{\rm d}\ln r$. Following the same nomenclature, $\gamma_\beta \equiv
-{\rm d}\ln\beta/{\rm d}\ln r = -\beta'/\beta$, where $'$ denotes a derivative with
respect to $\ln r$.

Equation 20 reveals the possibility of a radius $r_{\rm eq}$ where
the term in parentheses goes to zero, such that the enclosed
mass $M(r_{\rm eq})$ is minimally affected by our ignorance of $\beta(r)$:^

$$\gamma_\star(r_{\rm eq}) = 3 - \gamma_\sigma(r_{\rm eq}) - \gamma_\beta(r_{\rm eq}). \qquad (21)$$
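The step from Equation 19 to Equation 20 can be spelled out as follows. This is a sketch of the algebra, assuming the Jeans equation is used in the form $GM(r)/r = \sigma_r^2(\gamma_\star + \gamma_\sigma - 2\beta)$, which is equivalent to Equation 16. On the right-hand sides $\sigma_r$ and $\gamma_\sigma$ refer to the anisotropic solution.

```latex
\begin{align}
n_\star\sigma_r^2\big|_{\beta=0}
  &= n_\star\sigma_r^2\,(1-\beta) + \int_r^{\infty}\frac{\beta\,n_\star\sigma_r^2}{\tilde r}\,{\rm d}\tilde r
  && \text{(Equation 19)} \\
\frac{{\rm d}}{{\rm d}\ln r}\Big(n_\star\sigma_r^2\big|_{\beta=0}\Big)
  &= \frac{{\rm d}}{{\rm d}\ln r}\Big[n_\star\sigma_r^2\,(1-\beta)\Big] - \beta\,n_\star\sigma_r^2 \\
-\,n_\star\frac{G M(r;0)}{r}
  &= -\,n_\star\sigma_r^2\Big[(\gamma_\star+\gamma_\sigma)(1-\beta)+\beta'+\beta\Big] \\
\frac{G\,[M(r;\beta)-M(r;0)]}{r}
  &= \sigma_r^2\Big[(\gamma_\star+\gamma_\sigma-2\beta)-(\gamma_\star+\gamma_\sigma)(1-\beta)-\beta-\beta'\Big] \\
  &= \beta\,\sigma_r^2\Big(\gamma_\star+\gamma_\sigma-3-\frac{\beta'}{\beta}\Big)
   = \beta\,\sigma_r^2\,(\gamma_\star+\gamma_\sigma+\gamma_\beta-3).
\end{align}
```

The third line uses the isotropic Jeans equation on the left, and the last line uses $\gamma_\beta = -\beta'/\beta$.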

While in principle one needs to know $\gamma_\beta$ in order to determine
$r_{\rm eq}$, we argue below that this term must be small for
realistic cases that correspond to observed galaxies.^ Given
this, a solution for $r_{\rm eq}$ must exist. One can see this immediately,
as analyzing the luminosity-weighted^ average of
Equation 16 in conjunction with the scalar virial theorem
(Equation 3) requires that $\langle(\gamma_\star + \gamma_\sigma - 3)\,\sigma_r^2\rangle = 0$. Since
$\sigma_r^2(r)$ is positive definite, it must be true that there exists
at least one radius where $\gamma_\star = 3 - \gamma_\sigma$. More specifically, for
typically observed stellar profiles, $\gamma_\star(r)$ changes from being
close to zero (cored) in the center to larger than 3 in
the outer parts (to keep the stellar mass finite). (For example,
for a Plummer profile $\gamma_\star$ transitions from 0 to 5.)
The changes in $\gamma_\sigma(r)$ are more benign (see Equation A7).
Putting these facts together, we see that unless $\gamma_\beta$ is very
large in magnitude, Equation 21 will have a solution.

^ For $\beta$ profiles that are close to isotropic, solving for $r_{\rm eq}$ is not
necessary, as the right-hand side of Equation 20 is close to zero
everywhere.

^ Note that for anisotropy parameterizations that become close
to isotropic, $\gamma_\beta$ may be large. However, the combination $\beta\gamma_\beta = -\beta'$
is still well-behaved.

^ The integral is actually number-weighted, but we map number
density to luminosity density in a one-to-one manner.

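In order to determine the value of $M(r_{\rm eq})$ we manipulate
Equation A5 in order to isolate the relationship between
$\sigma_r^2(r)$ and $\langle\sigma_{\rm los}^2\rangle$:

$$\gamma_\varepsilon\, \langle\sigma_{\rm los}^2\rangle = \left[(\gamma_\star + \gamma_\sigma)(1 - \beta) + \beta + \beta'\right] \sigma_r^2. \qquad (22)$$

Here, the quantity $\gamma_\varepsilon(r)$ is dimensionless and depends only
on observable functions: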



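$$\gamma_\varepsilon(r) \equiv \frac{1}{\pi\, n_\star \langle\sigma_{\rm los}^2\rangle}\, \frac{{\rm d}}{{\rm d}\ln r} \int_{r^2}^{\infty} \frac{{\rm d}\left(\Sigma_\star \sigma_{\rm los}^2\right)}{{\rm d}R^2}\, \frac{{\rm d}R^2}{\sqrt{R^2 - r^2}}. \qquad (23)$$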



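Note that in the limit where $\sigma_{\rm los}$ is constant we have $\gamma_\varepsilon(r) =
\gamma_\star(r)$, which arises by utilizing an Abel inversion (Equation A2).
Now we may use Equations 16, 21, and 22 to show

$$M(r_{\rm eq}) = \gamma_\varepsilon(r_{\rm eq})\, G^{-1}\, \langle\sigma_{\rm los}^2\rangle\, r_{\rm eq}. \qquad (24)$$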
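The algebra behind Equation 24 is short enough to write out. The sketch below assumes the Jeans equation in the form $GM(r)/r = \sigma_r^2(\gamma_\star+\gamma_\sigma-2\beta)$ (equivalent to Equation 16) and uses $\gamma_\beta = -\beta'/\beta$; all quantities are evaluated at $r = r_{\rm eq}$.

```latex
\begin{align}
\text{Equation 21:}\quad & \gamma_\star+\gamma_\sigma = 3-\gamma_\beta = 3+\frac{\beta'}{\beta}, \\
\text{Jeans:}\quad & \frac{G M(r_{\rm eq})}{r_{\rm eq}}
   = \sigma_r^2\Big(3-2\beta+\frac{\beta'}{\beta}\Big), \\
\text{Equation 22:}\quad & \gamma_\varepsilon\,\langle\sigma_{\rm los}^2\rangle
   = \sigma_r^2\Big[\Big(3+\frac{\beta'}{\beta}\Big)(1-\beta)+\beta+\beta'\Big]
   = \sigma_r^2\Big(3-2\beta+\frac{\beta'}{\beta}\Big), \\
\Rightarrow\quad & M(r_{\rm eq}) = \gamma_\varepsilon(r_{\rm eq})\,G^{-1}\,\langle\sigma_{\rm los}^2\rangle\,r_{\rm eq}.
\end{align}
```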



As mentioned above, for generic cases the value of $r_{\rm eq}$
will depend on $\beta(r)$ and thus our ignorance of $\beta(r)$ is now
translated to $r_{\rm eq}$. However, as we discuss in the next section,
if the observed $\sigma_{\rm los}(R)$ does not vary much compared to
$\Sigma_\star(R)$ (as is true for most spheroidal systems), then $r_{\rm eq} \simeq r_3$
and $\gamma_\varepsilon(r_{\rm eq}) \simeq 3$. More generally, each galaxy will have a
different $r_{\rm eq}$, which can be searched for numerically using
Equation 20 in conjunction with the family of $M(r)$ and $\beta(r)$
profiles that solve the Jeans equation. When we actually
perform this analysis on real galaxies using our maximum
likelihood approach, we find that the likelihoods for $r_{\rm eq}$ peak



3.2.2 Spatially constant velocity dispersion anisotropy 

In this section, we assume that $\beta(r)$ is constant and show
that $r_{\rm eq}$ is close to $r_3$. We start with Equation 22 and set
$\beta' = 0$ to yield:

$$3\,\langle\sigma_{\rm los}^2\rangle \simeq \left[1 + \frac{1-\beta}{3-2\beta}\,\gamma_\sigma(r_3)\right] \sigma_{\rm tot}^2(r_3). \qquad (25)$$

We have assumed that $\sigma_{\rm los}$ varies slowly with radius such
that $\gamma_\varepsilon \simeq 3$. Of course, physically, $\sigma_{\rm los}$ has to decrease as $R$
approaches the stellar limiting radius, but we find numerically
that the relation above is still a good approximation
as long as the variations in the observed $\sigma_{\rm los}$ are mild at
$R \simeq R_{\rm e}$. Equation 25 tells us that if $\gamma_\sigma(r_3)$ is small and $\beta$ is
constant, then $\sigma_{\rm tot}^2(r_3) \simeq 3\langle\sigma_{\rm los}^2\rangle$. This provides one justification
for the second step in Equation 17.
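To get a feel for the size of this correction, one can evaluate the factor that follows from Equation 22 with $\beta' = 0$, $\gamma_\star(r_3) = 3$ and $\gamma_\varepsilon \simeq 3$, namely $3\langle\sigma_{\rm los}^2\rangle/\sigma_{\rm tot}^2(r_3) \simeq 1 + \gamma_\sigma(r_3)(1-\beta)/(3-2\beta)$. The sketch below is illustrative only; the function name and the sample values of $\gamma_\sigma$ and $\beta$ are assumptions, not values from the analysis.

```python
def sigma_tot_ratio(gamma_sigma, beta):
    """Return 3<sigma_los^2> / sigma_tot^2(r_3) for constant beta.

    Assumes gamma_star(r_3) = 3 and gamma_epsilon ~ 3 (flat sigma_los).
    """
    return 1.0 + gamma_sigma * (1.0 - beta) / (3.0 - 2.0 * beta)


if __name__ == "__main__":
    # Hypothetical sample values: a mild slope and three constant anisotropies.
    for beta in (-2.0, 0.0, 0.5):
        print(beta, sigma_tot_ratio(0.1, beta))
```

For a slope as small as $\gamma_\sigma(r_3) = 0.1$ the factor stays within a few percent of unity over this range of $\beta$.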

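We now turn to a more detailed computation of $\sigma_{\rm tot}^2(r_3)$
to elucidate the role of $\gamma_\sigma$, without explicitly assuming that
$\sigma_{\rm los}(R)$ is constant. Consider the average total velocity dispersion
written explicitly as an integral over $\sigma_r^2$,

$$\langle\sigma_{\rm tot}^2\rangle = 4\pi \int_0^{\infty} r^3\, n_\star\, \sigma_r^2\, (3 - 2\beta)\; {\rm d}\ln r. \qquad (26)$$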



In realistic cases, $n_\star$ will vary significantly with radius
from a flat inner profile with $\gamma_\star = 0$ at small $r$ to a steep profile
with $\gamma_\star > 3$ at large $r$. Thus the integrand is expected to
be single peaked unless $\sigma_r$ varies in an unexpectedly strong
way to compensate for the behavior of $n_\star$. However, since
observed $\sigma_{\rm los}$ profiles do not vary much with position on the
sky, $\sigma_r(r)$ must also vary smoothly with radius (at least for
constant $\beta$; see Equation A9). Thus the integrand will peak
at $r = r_\sigma$ such that $\gamma_\star(r_\sigma) + \gamma_\sigma(r_\sigma) = 3$. We may then use a
saddle point approximation after a Taylor expansion of the
natural logarithm of the integrand about $r_\sigma$, approximating
the integral as a Gaussian:



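$$\langle\sigma_{\rm tot}^2\rangle \simeq 4\pi A(r_\sigma) \int_{-\infty}^{\infty} \exp\left[-\frac{K(r_\sigma)}{2} \left(\ln\frac{r}{r_\sigma}\right)^2\right] {\rm d}\ln r = 4\pi \sqrt{\frac{2\pi}{K(r_\sigma)}}\, A(r_\sigma), \qquad (27)$$

where

$$A(r) \equiv r^3\, n_\star(r)\, \sigma_{\rm tot}^2(r), \quad {\rm and} \quad K(r) \equiv \gamma_\star'(r) + \gamma_\sigma'(r). \qquad (28)$$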



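Similarly, since $r^3 n_\star$ peaks at $r_3$, one can repeat the
analysis of the previous paragraph to write

$$1 = 4\pi \int_0^{\infty} r^3\, n_\star\; {\rm d}\ln r \simeq 4\pi\, r_3^3\, n_\star(r_3) \sqrt{\frac{2\pi}{\gamma_\star'(r_3)}}. \qquad (29)$$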



The term $A(r_\sigma)$ computed at $\gamma_\star + \gamma_\sigma = 3$ is different from
$A(r_3)$ at second order in $\gamma_\sigma(r_3)$. Thus, even for moderate
values of $\gamma_\sigma(r_3)$ we may replace $A(r_\sigma)$ in Equation 27 with
$A(r_3)$ to find (with the aid of Equations 28 and 29):

$$\langle\sigma_{\rm tot}^2\rangle \simeq \sigma_{\rm tot}^2(r_3) \sqrt{\frac{\gamma_\star'(r_3)}{\gamma_\star'(r_\sigma) + \gamma_\sigma'(r_\sigma)}} \simeq \sigma_{\rm tot}^2(r_3). \qquad (30)$$

The last approximation arises by neglecting the first order
correction in $\gamma_\sigma$, enabling us to evaluate the terms inside of
the square root at $r = r_3$. Our numerical mass estimates
show that the observational error is larger than that due to
the neglect of the $\gamma_\sigma$ term.
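The accuracy of the Gaussian (saddle-point) step can be checked numerically for a specific light profile. The sketch below does this for the normalization integral $4\pi\int r^3 n_\star\,{\rm d}\ln r = 1$ of a Plummer sphere, expanding about $r_3$. The choice of a Plummer profile, the integration grid and the function names are assumptions made here for illustration.

```python
import math


def plummer_density(r, a=1.0):
    """Plummer number density, normalised so that the total number is 1."""
    return 3.0 / (4.0 * math.pi * a**3) * (1.0 + (r / a) ** 2) ** (-2.5)


def exact_normalisation(a=1.0, n_steps=20000, ln_min=-12.0, ln_max=12.0):
    """Trapezoidal estimate of 4*pi * integral of r^3 n_star d(ln r)."""
    h = (ln_max - ln_min) / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        r = a * math.exp(ln_min + i * h)
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * r**3 * plummer_density(r, a)
    return 4.0 * math.pi * total * h


def saddle_point_normalisation(a=1.0):
    """Gaussian (saddle-point) estimate of the same integral, expanded about r_3."""
    x2 = 1.5                                   # (r_3/a)^2, where gamma_star = 3
    r3 = a * math.sqrt(x2)
    gamma_prime = 10.0 * x2 / (1.0 + x2) ** 2  # d(gamma_star)/d(ln r) at r_3
    peak = r3**3 * plummer_density(r3, a)
    return 4.0 * math.pi * peak * math.sqrt(2.0 * math.pi / gamma_prime)
```

For the Plummer profile the Gaussian estimate comes out near 0.90 rather than 1, so in this case the saddle-point step is good to roughly 10%.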

Next we take the derivative of Equation 22 at $r = r_3$:

$$\gamma_\star'(r_3) + \gamma_\sigma'(r_3) = \gamma_\varepsilon'(r_3)\, \frac{3 - 2\beta}{3 - 3\beta}, \qquad (31)$$



where we have neglected $\gamma_\sigma(r_3)$. From this expression, we
see that it is only for values of $\beta$ close to unity that the last
step in Equation 30 is not a good approximation. Such large
values of constant $\beta$, however, are disfavored by the Jeans
equation when considering realistic dispersion profiles. This
may be seen by taking a derivative of the Jeans equation
(Equation 16) at $r = r_3$ to write

$$\gamma_\star'(r_3) + \gamma_\sigma'(r_3) = (3 - 2\beta)(2 - \gamma_\rho), \qquad (32)$$

where we neglected the $\gamma_\sigma(r_3)$ term and where we set
$M(r) = M(r_3)\,(r/r_3)^{3-\gamma_\rho}$. Combining this with Equation
31 we require that

$$\beta \simeq 1 - \frac{\gamma_\varepsilon'(r_3)}{6 - 3\gamma_\rho}, \qquad (33)$$



which shows that $\beta$ values close to 1 are disfavored because
observations reveal that $\gamma_\varepsilon'(r_3)$ is of order unity for systems
in equilibrium.^ With regard to large negative $\beta$ values,
these extremes are preferred when $\gamma_\rho < 2$. We remind the
reader that in the above arguments we have neglected $\gamma_\sigma(r_3)$
in keeping with our focus on systems with flat observed velocity
dispersion profiles (see Equation A9).

^ Note that if $\gamma_\rho > 2$, Equation 33 yields the unphysical result
of $\beta > 1$, implying that $\gamma_\sigma(r_3)$ should not be neglected.
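Equations 31 and 32 combine to $\gamma_\varepsilon'(r_3) = 3(1-\beta)(2-\gamma_\rho)$, which can be solved for the constant $\beta$ implied by a given pair of slopes. The sketch below simply evaluates that relation; the function name and the sample slopes are assumptions for illustration ($\gamma_\varepsilon' = 2.4$ is what a Plummer profile with a flat $\sigma_{\rm los}$ would give, since then $\gamma_\varepsilon = \gamma_\star$).

```python
def beta_from_slopes(gamma_eps_prime, gamma_rho):
    """Constant beta implied by gamma_eps'(r_3) = 3 (1 - beta) (2 - gamma_rho).

    gamma_sigma(r_3) is neglected, as in the surrounding derivation.
    """
    denom = 6.0 - 3.0 * gamma_rho
    if denom == 0.0:
        raise ValueError("gamma_rho = 2 makes the relation singular")
    return 1.0 - gamma_eps_prime / denom


if __name__ == "__main__":
    # Hypothetical total-density slopes at r_3, from cored to nearly isothermal.
    for gamma_rho in (0.0, 1.0, 1.9):
        print(gamma_rho, beta_from_slopes(2.4, gamma_rho))
```

With these inputs $\beta$ moves from moderately radial for a cored total density to strongly tangential as $\gamma_\rho$ approaches 2 from below, and it exceeds 1 once $\gamma_\rho > 2$.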



As an aside, we note that even if we knew $\beta(r)$, uncertainties
in the inner stellar profile will limit how well we
recover the slope of the total density profile $\gamma_\rho$ at $r_3$.

Given this, Equation 30 can be considered a good approximation.
That is, $3\langle\sigma_{\rm los}^2\rangle \simeq \sigma_{\rm tot}^2(r_3)$ for $\beta$ constant and
as long as the observed $\sigma_{\rm los}$ does not vary much with position
on the sky. Our full numerical analysis of observed spectroscopic
data shows that this is indeed the preferred solution
of the Jeans equation. This realization, together with Equation 16,
allows us to derive our mass estimator presented in
Equation 2 with $r_{1/2} \simeq r_3$.

3.2.3 General velocity dispersion anisotropy 

Here we provide a qualitative understanding of why our
mass estimator works well in the general $\beta(r)$ case. We
begin by reconsidering the derivation of $\langle\sigma_{\rm tot}^2\rangle$, now allowing
$\beta$ to vary with radius. It is clear that the peak in
the integrand in Equation 26 will shift to a position where
$\gamma_\sigma + \gamma_\star + 2\beta'/(3 - 2\beta) = 3$. Thus even if $\gamma_\sigma$ is moderately
small, the peak may be shifted due to the third term. For
small values of $\beta$, the typical $|\beta'|/(3 - 2\beta)$ values are also
small in our parameterizations (Equations 11 and 13) and
hence the peak is close to $r_3$ as in the constant $\beta$ case. For
large negative values of $\beta$, the peak of the $\langle\sigma_{\rm tot}^2\rangle$ integrand is
essentially at $r_{\rm eq}$, but this does not imply that $r_{\rm eq}$ is close to
$r_3$. However, if $\beta(r_3)$ is not small, then is constrained
by Equation 22. This can be realized because the term that
determines the shift in the peak of Equation 26 for large
negative $\beta(r_3)$ values is

$$\gamma_\sigma(r_3) + \frac{\beta'(r_3)}{1 - \beta(r_3)} \simeq \frac{3\langle\sigma_{\rm los}^2\rangle - \sigma_{\rm tot}^2(r_3)}{\sigma_\theta^2(r_3)}. \qquad (34)$$

The simplest solution to this equation and Equation [51]
which is consistent with the Jeans equation is $3\langle\sigma_{\rm los}^2\rangle \simeq
\sigma_{\rm tot}^2(r_3)$ and $r_{\rm eq} \simeq r_3$. Our full mass likelihoods derived from
analyzing observed data confirm this expectation.

Since we have argued that the mass enclosed within $r_3$
should be approximately independent of $\beta(r)$, we may now
derive this mass by simply using Equation 5 with $\beta = 0$ at
$r_3$:

$$M(r_3) = \frac{r_3\, \sigma_r^2(r_3)}{G} \left[\gamma_\star(r_3) + \gamma_\sigma(r_3)\right]\Big|_{\beta=0}$$
$$\qquad\;\; \simeq \frac{3\, r_3\, \sigma_r^2(r_3)}{G}\Big|_{\beta=0} \simeq \frac{3\, r_3\, \langle\sigma_{\rm los}^2\rangle}{G}. \qquad (35)$$

This is again Equation 2 with $r_{1/2} \simeq r_3$. In the second line
we are using the fact that $3\sigma_r^2 = \sigma_{\rm tot}^2$ for $\beta = 0$ and our
result from the previous section that $\sigma_{\rm tot}^2(r_3) \simeq \langle\sigma_{\rm tot}^2\rangle$.

It is worth emphasizing that the ideal radius for mass
determination is $r_3$ and not $r_{1/2}$. As one moves away from
$r_3$, the uncertainty in $\beta(r)$ will start dominating over kinematic
(or photometric) errors. However, typically the observational
errors on both $r_3$ and $\langle\sigma_{\rm los}^2\rangle$ are large enough that
the slight (~15%) difference between $r_{1/2}$ and $r_3$ will not
matter. For this reason we have opted to present our results
using the more familiar deprojected half-light radius in what
follows. We find that for constant $\beta$ or for our monotonically
varying $\beta(r)$ form, both $M(r_{1/2})$ and $M(r_3)$ are equally well
constrained by the data sets we consider when analyzing the
population as a whole.

Of course, one expects the expression in Equation 2
to fail in special cases. For example, if the line-of-sight velocity
dispersion declines very rapidly within the half-light
radius (such that $\gamma_\sigma \simeq \gamma_\star$) then we would expect the mass-anisotropy
uncertainty to be minimized at a radius smaller
than $r_3$. However, if we ignore the very central regions of
spheroids with supermassive black holes, most dispersion-supported
galaxies do not show significant declines in their
stellar velocity dispersion profiles within their half-light
radii. Indeed, as we now discuss, we find that Equation 35
does a remarkably good job at reproducing the masses for
real galaxies that span a wide dynamic range in luminosity,
size, and mass, at least under the assumption of spherical
symmetry.

Figure 2. Left: The half-light masses for Milky Way dSphs (green squares, blue diamonds, red circles), Galactic globular clusters (yellow
stars), dwarf ellipticals (cyan triangles), and ellipticals (pink inverted triangles). The vertical axis shows masses obtained using our
full likelihood analysis. The horizontal axis shows mass estimates based on our mass estimator, Equation 2. The inset focuses on the
pre-SDSS (classical) dSphs, where the dotted lines indicate a 10% scatter in our mass estimator. Right: Errors on half-light masses for
Milky Way dSphs. The vertical axis shows the 68% error width derived from our full likelihood analysis and the horizontal axis shows
the error width calculated by straightforward error propagation using Equation 2. The agreement between the two demonstrates that
errors on the mass determinations within the 3D deprojected half-light radius $r_{1/2}$ are dominated by observational uncertainties rather
than theoretical uncertainties associated with $\beta(r)$. In both plots and in the inset the solid line indicates the one-to-one relation. The
stellar velocities used to derive the globular cluster (GC) masses (in conjunction with photometry from Harris (1996)) were obtained
from (lowest to highest mass): NGC 5053 (Yan & Cohen 1996), NGC 6171 (Piatek et al. 1994), NGC 288 (Pryor et al. 1991), NGC
104 (Mayor et al. 1983), NGC 362 (Fischer et al. 1993), NGC 5272 (Pryor et al. 1988), and NGC 2419 (Baumgardt et al. 2009). The
kinematic data for the classical dSphs were taken from Muñoz et al. (2005); Koch et al. (2007); Mateo et al. (2008); Walker et al. (2009a),
and data for the post-SDSS dSphs were taken from Muñoz et al. (2006); Simon & Geha (2007); Geha et al. (2009), and Willman et al. (in
preparation). The kinematic data for the ellipticals are as follows (from lowest to highest mass): NGC 185 (De Rijcke et al. 2006), NGC
855 (Simien & Prugniel 2000), NGC 4478 (Simien & Prugniel 1997a), NGC 731 (Simien & Prugniel 2000), NGC 3853 (Simien & Prugniel
1997b), and NGC 499 (Simien & Prugniel 1997c). The photometric data for the MW dSphs, dEs, and ellipticals are referenced in Table 1.
These specific dwarf ellipticals and ellipticals were chosen because they had extended kinematic data (to $R_{\rm e}$) and showed little rotation.

3.3 Tests 

The left-hand panel of Figure 2 presents the integrated
masses within $r_{1/2}$ obtained using our fiducial likelihood
analysis for a variety of spheroidal systems plotted against
the simple mass estimator in Equation 2. We see that this
formula is accurate over almost eight decades in $M_{1/2}$. As detailed
in the caption, we use individual stellar velocity data
sets in our likelihoods for MW globular clusters and dSphs,
and published velocity dispersion profiles for the dwarf elliptical
galaxies (dEs) and elliptical galaxies (Es). Observed
properties and derived masses for each of these systems are
presented in Table 1.

To demonstrate the accuracy of the normalization in our
formula we add an inset to Figure 2 that zooms in on
the region populated by the so-called "classical" (pre-SDSS)
MW dSphs, since they have the most well-measured and
spatially extended stellar velocity distributions and well-studied
photometry. The dashed lines indicate ±10% variation
about the predicted relation. In the right-hand panel
of Figure 2 we demonstrate that Equation 2 also provides a
good measure of uncertainties on $M_{1/2}$ for the MW dSphs^
(compare to Figure C1). The errors on the vertical axis are
68% likelihoods derived from our analysis, while the errors
along the horizontal axis are calculated by simply propagating
the observational errors on $r_{1/2}$ and $\sigma_{\rm los}^2$ through Equation 2.
This rough agreement is consistent with the $M_{1/2}$
uncertainty being dominated by observational errors as opposed
to the uncertainty in $\beta$, as expected.

^ Leo IV is not included in the right-hand panel because it
has very few accurate kinematic stellar measurements.

[Figure 3: two panels; horizontal axes: Mean 3D Half-light Radius [pc]]

Figure 3. The half-light masses of the Milky Way dSphs plotted against $r_{1/2}$. Left: The solid black line shows the NFW mass profile for
a field halo of $M_{\rm halo} = 3 \times 10^9\,M_\odot$ at $z = 0$ expected for a WMAP5 cosmology ($c = 11$ according to Macciò et al. 2008), where the two
dashed lines correspond to a spread in concentration of $\Delta\log_{10}(c) = 0.14$, as determined by N-body simulations (Wechsler et al. 2002).
The orange dot-dashed line shows the profile for a median $M_{\rm halo} = 3 \times 10^9\,M_\odot$ at $z = 3$. Right: The same data points along with the
(median $c$) NFW mass profiles for halos with $M_{\rm halo}$ masses ranging from $3 \times 10^7\,M_\odot$ to $3 \times 10^{11}\,M_\odot$ (from bottom to top). We note
that while all but one of the MW dSphs are consistent with sitting within a halo of a common mass (left), many of the dwarfs can also
sit in halos of various masses (right). There is no indication that lower luminosity galaxies (red circles) are associated with less massive
halos than the highest mass galaxies (green squares), as might be expected in simple models of galaxy formation. None of these galaxies
are associated with a halo less massive than $M_{\rm halo} \simeq 3 \times 10^8\,M_\odot$.

It is worth emphasizing that Equation 2 is not able to
capture the full uncertainty on the half-light mass in cases
where the kinematic data do not constrain $\sigma_{\rm los}$ beyond
$R_{\rm e}$. While our full likelihood procedure naturally takes into
account any limitations in the data and factors them into
the resultant mass uncertainty, Equation 2 was derived under
the assumption that $\sigma_{\rm los}$ remains constant out beyond
$R \simeq R_{\rm e}$. The lack of extended kinematic data is manifest
in the more massive galaxies presented in Figure 2. A
careful examination of the dEs and regular Es (those with
$M_{1/2} > 10^8\,M_\odot$) reveals that the errors on the ordinate axis
are on average 0.05 dex larger than the errors on the abscissa.
Therefore, in cases where extended kinematics are
not available, if one is willing to assume that an unmeasured
velocity dispersion profile does not fall too sharply
within $\sim 1.5\,R_{\rm e}$ (as is seen in most galaxies with measured
dispersion profiles that extend this far), then our proposed
estimator should provide an accurate description of the half-light
mass and the associated uncertainty (via simple error
propagation). If one does not wish to accept the assumption
of a flat $\sigma_{\rm los}$ profile, then adding an error of 0.05 dex to the
propagated mass error provides a reasonable means to allow
for a range of $\beta$ profiles.
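The "simple error propagation" mentioned above can be sketched as follows for $M_{1/2} \propto \langle\sigma_{\rm los}^2\rangle\, r_{1/2}$. This is only an illustration: it assumes independent, roughly Gaussian errors on $\langle\sigma_{\rm los}^2\rangle$ and $r_{1/2}$, first-order propagation, and a linear addition of the optional 0.05 dex term; the function and parameter names are hypothetical.

```python
import math


def log_mass_error_dex(sigma2, sigma2_err, r_half, r_half_err, extra_dex=0.0):
    """First-order 1-sigma error on log10(M_1/2), in dex.

    sigma2, sigma2_err : <sigma_los^2> and its error (any consistent units)
    r_half, r_half_err : r_1/2 and its error (any consistent units)
    extra_dex          : optional allowance, e.g. 0.05 dex for non-flat profiles,
                         added linearly here (an assumption of this sketch)
    """
    # Fractional errors add in quadrature for a product of independent factors.
    frac_err = math.hypot(sigma2_err / sigma2, r_half_err / r_half)
    # Convert a fractional error to dex: d(log10 M) = dM / (M ln 10).
    return frac_err / math.log(10.0) + extra_dex
```

For example, a 20% error on $\langle\sigma_{\rm los}^2\rangle$ and a 10% error on $r_{1/2}$ correspond to roughly 0.10 dex in mass before any extra allowance is added.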

We note that all of the mass modeling presented so
far has been done by allowing $\beta(r)$ to vary according to
the profile in Equation 11. This allows for $\beta(r)$ to vary
monotonically with three free parameters. All of the results
quoted in Table 1 allow for this sort of spatial variation
in $\beta(r)$. Though this profile is fairly general and has the
added virtue that it is reminiscent of the anisotropy of cold
dark matter particles found in numerical simulations (e.g.,
Carlberg et al. 1997), we have also performed our analysis
using the $\beta(r)$ form in Equation 13, which allows for an extremum
within the stellar light distribution. We find that
even with this unusual family of $\beta(r)$ profiles, no bias in
the mass estimates exists (within either $r_3$ or $r_{1/2}$) between
the two $\beta(r)$ forms. However, the errors on $M_{1/2}$ increased
by roughly 0.05 dex when the (rather extreme) second $\beta(r)$
form was used. The errors on $M(r_3)$ were slightly less affected.
Hence Equation 1 becomes preferable to Equation 2
for the most general $\beta(r)$ profiles, as long as the required
photometric measurements (for $r_3$) and kinematic data sets
(for $\langle\sigma_{\rm los}^2\rangle$) are good enough to warrant the need for 10%
accuracy.

Before moving on, we mention that in Appendix C1 we
perform a similar test using our full mass modeling procedure
against a popular mass estimator for dSphs known
as the Illingworth (1976) approximation. We show that
the Illingworth formula fails both because it systematically
under-predicts masses and because it under-predicts mass
uncertainties. The main reason for the failure is that it was
derived for mass-follows-light globular clusters using $\beta = 0$.
It was never intended to be generally applicable to dark-matter
dominated systems like dSphs.

Lastly, in Appendices C2 and C3 we compare Equation 2
to the mass estimators presented by Spitzer (1969) and
Cappellari et al. (2006).



4 DISCUSSION 

We have shown that the integrated mass within the half-light
radius of spherically symmetric, dispersion-supported
systems is very well constrained by line-of-sight kinematic
observations with only mild assumptions about the spatial
variation of the stellar velocity dispersion anisotropy:
$M_{1/2} \simeq 3\,G^{-1}\, \langle\sigma_{\rm los}^2\rangle\, r_{1/2}$. Mass determinations at larger and
smaller radii are much more uncertain because of the uncertainty
in $\beta(r)$. In the following two subsections we use
$M_{1/2}$ determinations to examine the dark matter halos of
MW dSphs and to explore the mass-luminosity relation in
dispersion-supported galaxies as a function of mass scale.



4.1 Dwarf spheroidal satellite galaxies of the 
Milky Way 

As an example of the utility of $M_{1/2}$ determinations, both
panels of Figure 3 present $M_{1/2}$ vs. $r_{1/2}$ for MW dSph galaxies.
We have used our full mass likelihood approach in deriving
these masses and associated error bars, though had we
simply used Equation 2 the result would have been very similar.
In interpreting this figure, it is important to emphasize
that the galaxies represented here span almost five orders
of magnitude in luminosity. Relevant parameters for each of
the galaxies are provided in Table 1. The symbol types labeled
on the plot correspond to three wide luminosity bins
(following the same scheme represented in Figure 2). Note
that among galaxies with the same half-light radii, there is
no clear trend between luminosity and density. We return to
this noteworthy point below.

It is interesting now to compare the data points in
Figure 3 to the integrated mass profile $M(r)$ predicted for
ΛCDM halos of a given $M_{\rm halo}$ mass. We define $M_{\rm halo}$ as the
halo mass corresponding to an overdensity of 200 compared
to the critical density. In the limit that dark matter halo
mass profiles $M(r)$ map in a one-to-one way with their $M_{\rm halo}$
mass (Navarro et al. 1997), the points on this figure
may be used to estimate an associated halo mass for each
galaxy. The association is not perfect for three reasons: 1)
some scatter exists in halo concentration at fixed mass and
redshift (e.g., Jing 2000; Bullock et al. 2001); 2) the mapping
between $M(r)$ and $M_{\rm halo}$ evolves slightly with redshift
(e.g., Bullock et al. 2001); and 3) the MW satellites all reside
within subhalos, which tend to lose mass after accretion
from the field (see Kazantzidis et al. 2004). Nevertheless, we
may still examine the median $M(r)$ dark matter halo profile
for a given $M_{\rm halo}$ in order to provide a reasonable estimate
of their progenitor halo masses prior to accretion onto
the Milky Way.

The solid line in the left panel of Figure 3 shows the
mass profile for an NFW (Navarro et al. 1997) dark matter
halo at $z = 0$ with a halo mass $M_{\rm halo} = 3 \times 10^9\,M_\odot$. We have
used the median concentration ($c = 11$) predicted by the
Bullock et al. (2001) mass-concentration model updated by
Macciò et al. (2008) for WMAP5 ΛCDM parameters. The
dashed lines indicate the expected 68% scatter about the
median concentration at this mass. The orange dot-dashed
line shows the expected $M(r)$ profile for the same mass
halo at $z = 3$ (corresponding to a concentration of $c = 4$),
which provides an estimate of the scatter that would result
from the scatter in infall times. We see that each MW dSph
is consistent with inhabiting a dark matter halo of mass
$\sim 3 \times 10^9\,M_\odot$ (Strigari et al. 2008). Walker et al. (2009b)
recently submitted an article that presented a similar result
for Milky Way dSphs by examining the mass within a radius
$r = R_{\rm e}$ rather than $r = r_{1/2}$, as we have done. Note
that since $R_{\rm e} \simeq 0.75\,r_{1/2}$, the mass within $r = R_{\rm e}$ is still
somewhat constrained without prior knowledge of $\beta$.
The right panel in Figure 3 shows the same data plotted
along with the median mass profiles for several different halo
masses. Clearly, the data are also consistent with MW dSphs
populating dark matter halos of a wide range in $M_{\rm halo}$. As
described in Strigari et al. (2008), there is a weak power-law
relation between a halo's inner mass and its total mass (e.g.,
$M(300\,{\rm pc}) \propto M_{\rm halo}^{1/3}$ at $M_{\rm halo} \simeq 10^9\,M_\odot$), and this makes
a precise mapping between the two difficult. Nevertheless,
several interesting trends are manifest in the comparison.

First, all of the MW dSphs are associated with halos
more massive than $M_{\rm halo} \simeq 10^8\,M_\odot$. This provides a
very stringent limit on the fraction of the baryons converted
to stars in these systems. More importantly, there
is no systematic relationship between dSph luminosity and
the $M_{\rm halo}$ mass profile that they most closely intersect. The
ultra-faint dSph population (red circles) with $L_V < 10{,}000\,L_\odot$
is equally likely to be associated with the more massive
dark matter halos as are classical dSphs that are more
than 1,000 times brighter (green squares). Indeed, a naive
interpretation of the right-hand panel of Figure 3 shows
that the two least luminous satellites (which also have the
smallest $M_{1/2}$ and $r_{1/2}$ values) are associated with halos
that are either more massive than any of the classical MW
dSphs (green squares), or have abnormally large concentrations
(reflecting earlier collapse times) for their halo mass.
This general behavior is difficult to reproduce in models
constructed to confront the Milky Way satellite population
(e.g., Koposov et al. 2009; Li et al. 2009; Macciò et al. 2009;
Muñoz et al. 2009; Salvadori & Ferrara 2009; Busha et al.
2010; Kravtsov 2010), which typically predict a noticeable
trend between halo infall mass and dSph luminosity. It is
possible that we are seeing evidence for a new scale in galaxy
formation (Strigari et al. 2008) or that there is a systematic
bias that makes less luminous galaxies that sit within low-mass
halos more difficult to detect than their more massive
counterparts (Bovill & Ricotti 2009; Bullock et al. 2009).



4.2 The global population of dispersion-supported 
stellar systems 

A second example of how accurate $M_{1/2}$ determinations may
be used to constrain galaxy formation scenarios is presented
in Figure 4, where we examine the relationship between
the half-light mass $M_{1/2}$ and the half-light I-band luminosity
$L_{1/2} = 0.5\,L_I$ for the full range of dispersion-supported
stellar systems in the Universe: globular clusters, dSphs,
dwarf ellipticals, ellipticals, brightest cluster galaxies, and
extended cluster spheroids. Each symbol type is matched to
a galaxy type as detailed in the caption. We provide three
representations of the same information in order to highlight
different aspects of the relationships: $M_{1/2}$ vs. $L_{1/2}$ (left
panel); the dynamical I-band mass-to-light ratio within the
half-light radius $\Upsilon^I_{1/2}$ vs. $M_{1/2}$ (middle panel); and $\Upsilon^I_{1/2}$ vs.
total I-band luminosity $L_I$ (right panel).

[Figure 4: three panels; horizontal axes: $M_{1/2}$ [$M_\odot$], $M_{1/2}$ [$M_\odot$], I-band Luminosity [$L_\odot$]]

Figure 4. Left: The half I-band luminosity $L_{1/2}$ vs. half-light mass $M_{1/2}$ for a broad population of spheroidal galaxies. Middle: The
dynamical I-band half-light mass-to-light ratio $\Upsilon^I_{1/2}$ vs. $M_{1/2}$ relation. Right: The equivalent $\Upsilon^I_{1/2}$ vs. total I-band luminosity $L_I = 2L_{1/2}$
relation. The solid line in the left panel guides the eye with $M_{1/2} = L_{1/2}$ in solar units. The solid, colored points are all derived using
our full mass likelihood analysis and their specific symbols/colors are linked to galaxy types as described in Figure 2. The I-band
luminosities for the MW dSph and GC population were determined by adopting M92's $V - I = 0.88$. All open, black points are
taken from the literature as follows. Those with $M_{1/2} > 10^8\,M_\odot$ are modeled using Equation 2 with $\sigma_{\rm los}$ and $r_{1/2}$ culled from the
compilation of Zaritsky et al. (2006): triangles for dwarf ellipticals (Geha et al. 2003), inverse triangles for ellipticals (Jørgensen et al.
1996; Matković & Guzmán 2005), plus signs for brightest cluster galaxies (Oegerle & Hoessel 1991), and asterisks for cluster spheroids,
which, following Zaritsky et al. (2006), include the combination of the central brightest cluster galaxy and the extended intracluster light.
Stars indicate globular clusters, with the subset of open, black stars taken from Pryor & Meylan (1993).

Masses for the colored points are derived using our full
mass likelihood approach and follow the same color and symbol
convention as in Figure 2. All of the black points that
represent galaxies were modeled using Equation 2 with published
$\sigma_{\rm los}$ and $r_{1/2}$ values from the literature.^ The middle
and right panels are inspired by (and qualitatively consistent
with) Figures 9 and 10 from Zaritsky et al. (2006), who presented
estimated dynamical mass-to-light ratios as a function
of $\sigma_{\rm los}$ for spheroidal galaxies that spanned two orders
of magnitude in $\sigma_{\rm los}$.

We n ote that the asteris ks in Figure |4] are cluster 
spheroids l|Zaritskv et al.l [20061 ). which are defined for any 
galaxy cluster to be the sum of the extended low-surface 
brightness intracluster light component and the brightest 
cluster galaxy's light. These two components are difficult 
to disentangle, but the total light tends to be dominated 
by the intracluster piece. One might argue that the total 
cluster spheroid is more relevant than the brightest cluster 
galaxy because it allows one to compare the dominant stel- 
lar spheroids associated with individual dark matter halos 
over a very wide mass range self consistently. Had we in- 
cluded analogous diffuse light components around less mas- 
sive galaxies (e.g., stellar halos around field ellipticals) the 
figure would change very little, because halo light is of min- 
imal imp ortance for the total luminosity in less massive sys- 
tems fsee lPurcell et al.ll2007l '). One concern is that the cen- 
tral cluster spheroid mass estimates here suffer from a po- 
tential systematic bias because they rely on the measured 

The masses for the open, black stars (globular clusters) were
taken directly from Pryor & Meylan (1993).



velocity dispersion of cluster galaxies for $\sigma_{\rm los}$ rather than
the velocity dispersion of the cluster spheroid itself, which
is very hard to measure (Zaritsky et al. 2006). For completeness,
we have included brightest cluster galaxies on this
diagram (plus signs) and they tend to smoothly fill in the region
between large elliptical galaxies (inverse triangles) and
the cluster spheroids (asterisks).

There are several noteworthy aspects to Figure 4, which
are each highlighted in a slightly different fashion in the
three panels. First, as seen most clearly in the middle and
right panels, the dynamical half-light mass-to-light ratios
of spheroidal galaxies in the Universe demonstrate a minimum
at $\Upsilon^I_{1/2} \simeq 2$-$4$ that spans a remarkably broad
range of masses $M_{1/2} \simeq 10^{9-11}\,M_\odot$ and luminosities
$L_I \simeq 10^{8.5-10.5}\,L_\odot$. It is interesting to note the offset
in the average dynamical mass-to-light ratios between globular clusters
and ellipticals, which may suggest that even within $r_{1/2}$,
dark matter may constitute the majority of the mass content
of $L_*$ elliptical galaxies. Nevertheless, it seems that dark
matter plays a clearly dominant dynamical role ($\Upsilon^I_{1/2} > 5$)
within $r_{1/2}$ in only the most extreme systems (see similar results
by Dabringhausen et al. 2008; Forbes et al. 2008, who
study slightly more limited ranges of spheroidal galaxy
luminosities). The dramatic increase in dynamical half-light
mass-to-light ratios at both smaller and larger mass and
luminosity scales is indicative of a decrease in the efficiency
of galaxy formation in the smallest and largest dark matter
halos. It is worth mentioning that a qualitatively similar
trend in the relationship between $M_{\rm halo}$ and $L$ must exist if
$\Lambda$CDM is to explain the luminosity function of galaxies (e.g.,

In addition, concerns exist with the assumption of dynamical
equilibrium. However, Willman et al. (2004) demonstrated with
a simulation that using the intracluster stars as tracers of cluster
mass is accurate to ~10%.
White & Rees 1978; Marinoni & Hudson 2002; Yang et al. 2003;
Conroy & Wechsler 2009; Moster et al. 2010). While
the relationship presented in Figure 4 focuses on a different
mass variable, the similarity in the two relationships is
striking, and generally encouraging for the theory.

One may gain some qualitative insight into the physical
processes that drive galaxy formation inefficiency in faint vs.
bright systems by considering the $M_{1/2}$ vs. $L_{1/2}$ relation (left
panel) in more detail. We observe three distinct power-law
regimes $M_{1/2} \propto L_{1/2}^{\epsilon}$ with $\epsilon < 1$, $\epsilon \simeq 1$,
and $\epsilon > 1$ as mass increases. Over the broad middle range of
galaxy masses, $M_{1/2} \simeq 10^{9-11}\,M_\odot$, mass and light track each
other quite closely with $\epsilon \simeq 1$, while very faint galaxies obey
$\epsilon \simeq 1/2$, and bright elliptical galaxies have $\epsilon \simeq 4/3$,
transitioning to $\epsilon \gg 1$ for the most luminous cluster spheroids.
One may interpret the transition from $\epsilon < 1$ in faint galaxies to
$\epsilon > 1$ in bright galaxies as a transition from mass-suppressed
galaxy formation to luminosity-suppressed galaxy formation.
That is, for faint galaxies ($\epsilon < 1$), we do not see any
evidence for a low-luminosity threshold in galaxy formation,
but rather we are seeing behavior closer to a threshold (minimum)
mass with variable luminosity. For brighter spheroids
with $\epsilon > 1$, the increased dynamical mass-to-light ratios are
driven more by increasing the mass at fixed luminosity,
suggestive of a maximum luminosity scale.
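
The connection between these power-law regimes and the U-shape seen in the mass-to-light panels can be written in one line. The relation below is an illustrative sketch that assumes a pure local power law between mass and light, which the data follow only approximately:

```latex
\Upsilon_{1/2} \equiv \frac{M_{1/2}}{L_{1/2}} \propto L_{1/2}^{\,\epsilon-1}
\quad\Longrightarrow\quad
\frac{d\ln\Upsilon_{1/2}}{d\ln L_{1/2}} = \epsilon - 1 .
```

A slope below unity therefore gives a mass-to-light ratio that rises toward fainter systems, a slope near unity gives a flat minimum, and a slope above unity gives a rise toward brighter systems.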

Regardless of the interpretation of Figure 4, it provides
a useful empirical benchmark against which theoretical models
can be compared. Interestingly, two of the least luminous
dSph satellites of the Milky Way have the highest dynamical
mass-to-light ratios, $\Upsilon^I_{1/2} \simeq 3{,}200$, of any collapsed
structures shown, including intra-cluster light spheroids, which reach
values of $\Upsilon^I_{1/2} \simeq 800$. It is well known that the ultra-faint
dSphs are the most dark matter dominated objects known
(e.g., Strigari et al. 2008). For example, they have much
lower baryon-to-dark matter fractions, $f_b \sim \Omega_b/\Omega_{\rm dm} \lesssim 10^{-3}$,
than galaxy clusters, $f_b \simeq 0.1$. Now we see that ultra-faint
dSphs also have higher dynamical mass-to-visible light ratios
within their stellar extents than even the (well-studied)
galaxy cluster spheroids.



5 CONCLUSIONS 

We have shown that line-of-sight kinematic observations enable
accurate mass determinations for spherical, dispersion-supported
galaxies within a characteristic radius that is
approximately equal to $r_3$, the radius where the log-slope
of the stellar density profile is $-3$. For a wide range of
observed spheroidal galaxy stellar luminosity profiles $r_3$ is
close to the 3D deprojected half-light radius $r_{1/2}$, and we
have opted to quote our main result in terms of the mass
enclosed within $r_{1/2}$. While mass determinations at both
larger and smaller radii remain uncertain because of the
unknown velocity anisotropy (§3.1), the half-light mass is
accurately determined by the simple expression
$M_{1/2} = 3\,G^{-1}\langle\sigma^2_{\rm los}\rangle\,r_{1/2} \simeq 4\,G^{-1}\langle\sigma^2_{\rm los}\rangle\,R_e$
as long as the velocity dispersion profile $\sigma_{\rm los}(R)$ remains
relatively flat out to the 2D projected half-light radius $R_e$.
We derived this expression analytically using a few
observationally-motivated assumptions in §3.2, and demonstrated its
accuracy over eight orders of magnitude in both luminosity and in
$M_{1/2}$ by comparing it to detailed modeling of real galaxy data in §3.3.
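
As a minimal sketch of how the half-light mass expression can be evaluated in practice, the snippet below computes $M_{1/2}$ from a luminosity-weighted dispersion and a half-light radius. The function names, the unit choices (km/s and pc), the numerical value of $G$ in those units, and the example inputs are assumptions made for illustration and are not taken from the text.

```python
# Gravitational constant in pc (km/s)^2 / Msun (standard value; an assumption of this sketch).
G_PC_KMS2_PER_MSUN = 4.30091e-3


def half_light_mass_from_re(sigma_los_sq, r_e_pc):
    """M_1/2 ~= 4 <sigma_los^2> R_e / G, in solar masses.

    sigma_los_sq : luminosity-weighted <sigma_los^2> in (km/s)^2
    r_e_pc       : 2D projected half-light radius R_e in pc
    """
    return 4.0 * sigma_los_sq * r_e_pc / G_PC_KMS2_PER_MSUN


def half_light_mass_from_rhalf(sigma_los_sq, r_half_pc):
    """M_1/2 = 3 <sigma_los^2> r_1/2 / G, with r_1/2 the 3D half-light radius in pc."""
    return 3.0 * sigma_los_sq * r_half_pc / G_PC_KMS2_PER_MSUN


if __name__ == "__main__":
    # Hypothetical inputs: sigma_los = 10 km/s, R_e = 300 pc.
    sigma_los_sq = 10.0 ** 2
    r_e = 300.0
    print("M_1/2 from R_e   [Msun]:", half_light_mass_from_re(sigma_los_sq, r_e))
    # The two forms agree when r_1/2 = (4/3) R_e.
    print("M_1/2 from r_1/2 [Msun]:", half_light_mass_from_rhalf(sigma_los_sq, 4.0 * r_e / 3.0))
```

The two functions return the same value whenever $r_{1/2} = (4/3) R_e$, which is the approximation that links the two forms of the estimator.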



The two main assumptions we have made in this work are
that the systems that we are analyzing are spherically symmetric
and are in dynamical equilibrium. Testing the accuracy
of Equation 2 as a function of ellipticity will be an
important future step.

As an example of the usefulness of the $M_{1/2}$ estimator,
we applied our result to the dSph satellite population of the
Milky Way and specifically used the observed $M_{1/2}$ vs. $r_{1/2}$
relation to associate a dark matter halo mass $M_{\rm halo}$ with each
galaxy. By allowing for the expected scatter in halo concentrations
at fixed mass, we showed that all of the MW dSphs
are consistent with inhabiting dark matter halos of mass
$M_{\rm halo} \simeq 3 \times 10^{9}\,M_\odot$. We also showed that a range of $M_{\rm halo}$
values from $\sim 10^{8}\,M_\odot$ to $3 \times 10^{11}\,M_\odot$ is allowable as well, but
that no trend exists between the associated $M_{\rm halo}$ and galaxy
luminosity, despite the fact that these galaxies span over four
orders of magnitude in luminosity. Specifically, the lowest
luminosity dSphs ($L_V \simeq 500\,L_\odot$) are at least as dense as, if not
more dense than, the brightest MW dSphs ($L_V \simeq 10^{7}\,L_\odot$)
when normalized against the inner power-law mass profiles
expected in $\Lambda$CDM halos. This last point is difficult to reproduce
in models that assume a monotonic mapping between
$M_{\rm halo}$ and galaxy luminosity. It is worth emphasizing that
none of the MW dSphs are associated with dark matter halos
smaller than $M_{\rm halo} \simeq 10^{8}\,M_\odot$, and this alone provides
a very tight constraint on the fraction of baryons converted
to stars in these systems. Of course, these results assume
that no systematic biases in the kinematic data for dSph
galaxies are present. One particular worry is the effect of
binary stars. Minor et al. (2010) estimate that medium-to-high
binary fractions can inflate velocity dispersions by up
to ~20% in the smallest dSphs. This will have to be taken
into account in future work, at least for the classical dwarfs
that only have ~10% errors on their $M_{1/2}$ estimates.
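
To see why such an inflation matters at this level of precision, recall that the estimator scales with the square of the dispersion. The short calculation below is a sketch that assumes the full 20% inflation applies uniformly to the luminosity-weighted dispersion, which is the most pessimistic reading of the quoted upper limit:

```latex
\frac{M_{1/2}^{\rm obs}}{M_{1/2}^{\rm true}}
= \left(\frac{\sigma_{\rm los}^{\rm obs}}{\sigma_{\rm los}^{\rm true}}\right)^{2}
= (1.2)^{2} = 1.44 .
```

Under that assumption the mass would be overestimated by roughly 44%, several times larger than a ~10% statistical error.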

We went on to explore the relationship between $M_{1/2}$
and $L_I$ in dispersion-supported galaxies, spanning the full
range in I-band luminosity and mass from globular clusters
($L_I \simeq 10^{5}\,L_\odot$) to intra-cluster light spheroids
($L_I \simeq 10^{12}\,L_\odot$). Globular clusters excluded, the $\Upsilon^I_{1/2}$ vs. $M_{1/2}$
relation for dispersion-supported galaxies follows a U-shape,



with a broad minimum near $\Upsilon^I_{1/2} \simeq 3$ that spans dwarf
elliptical galaxies to normal elliptical galaxies, a steep rise to
$\Upsilon^I_{1/2} \simeq 3{,}200$ for ultra-faint dSphs, and a more shallow rise
to $\Upsilon^I_{1/2} \simeq 20$ for brightest cluster ellipticals. If we include
intra-cluster light spheroids in the analysis, the rise continues
to $\Upsilon^I_{1/2} \simeq 800$ for the largest galaxy clusters.

Lastly, we note that Equation 2 can be rewritten succinctly
in terms of the circular velocity at $r_{1/2}$ as

$$V_{\rm circ}(r_{1/2}) = \sqrt{3\,\langle\sigma^2_{\rm los}\rangle}. \qquad (36)$$

It is clear then that the maximum circular velocity of the
dark matter halo hosting such a dispersion-supported galaxy
must obey $V_{\rm max} \geq \sqrt{3\,\langle\sigma^2_{\rm los}\rangle}$.
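
The step from the half-light mass to the circular velocity can be spelled out explicitly. The lines below are a sketch using only the standard definition of circular velocity for a spherical mass distribution together with the half-light mass expression quoted in this section:

```latex
V_{\rm circ}^2(r_{1/2}) \equiv \frac{G\,M_{1/2}}{r_{1/2}}
= \frac{G}{r_{1/2}}\left(3\,G^{-1}\langle\sigma^2_{\rm los}\rangle\,r_{1/2}\right)
= 3\,\langle\sigma^2_{\rm los}\rangle .
```

Because the maximum of a rotation curve cannot lie below its value at any single radius, the lower bound on $V_{\rm max}$ follows immediately.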

In summary, we have shown that the dynamical mass 
within the deprojected half-light radius of dispersion- 
supported galaxies can be measured accurately with 
only line-of-sight stellar velocity measurements. We have 
provided a simple formula that allows this mass to be 
computed given the measured luminosity-weighted square 
of the line-of-sight velocity dispersion and the half-light 
radius. This result opens up new opportunities to explore 
the relationships between stellar properties and the masses
of galaxies spanning approximately ten orders of magnitude 
in luminosity. 
APPENDIX A: AN EXPRESSION FOR MASS AS A FUNCTION OF OBSERVABLES



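Here we derive a single expression for the mass profile of spheroidal galaxies $M(r;\beta)$ as a function of the observable combination
$\Sigma_*\sigma^2_{\rm los}$. We begin by manipulating the standard equation for $\sigma_{\rm los}$ in order to isolate the $R$ dependence into an integral kernel:

$$
\begin{aligned}
\Sigma_*\sigma^2_{\rm los}(R)
 &= \int_{R^2}^{\infty} n_*\sigma_r^2 \left[1 - \beta\,\frac{R^2}{r^2}\right] \frac{dr^2}{\sqrt{r^2 - R^2}} \\
 &= \int_{R^2}^{\infty} n_*\sigma_r^2\,(1-\beta)\,\frac{dr^2}{\sqrt{r^2 - R^2}}
  + \int_{R^2}^{\infty} \frac{\beta\,n_*\sigma_r^2}{r^2}\,\sqrt{r^2 - R^2}\;dr^2 \\
 &= \int_{R^2}^{\infty} n_*\sigma_r^2\,(1-\beta)\,\frac{dr^2}{\sqrt{r^2 - R^2}}
  - \left[\sqrt{r^2 - R^2}\int_{r^2}^{\infty} \frac{\beta\,n_*\sigma_r^2}{\tilde r^2}\,d\tilde r^2\right]_{R^2}^{\infty}
  + \int_{R^2}^{\infty} \left(\int_{r^2}^{\infty} \frac{\beta\,n_*\sigma_r^2}{2\,\tilde r^2}\,d\tilde r^2\right) \frac{dr^2}{\sqrt{r^2 - R^2}} \\
 &= \int_{R^2}^{\infty} \left[(1-\beta)\,n_*\sigma_r^2 + \int_{r^2}^{\infty} \frac{\beta\,n_*\sigma_r^2}{2\,\tilde r^2}\,d\tilde r^2\right] \frac{dr^2}{\sqrt{r^2 - R^2}}, \qquad (A1)
\end{aligned}
$$

where we employed an integration by parts to achieve the third equality. Note that we have set the middle term on the third
line to zero by making the physically-motivated assumption that the combination $\beta n_*\sigma_r^2$ falls faster than $r^{-1}$ at large $r$.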
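With this crucial manipulation in place, we may now utilize the following Abel inversion

$$
f(x) = \int_x^{\infty} \frac{g(t)\,dt}{\sqrt{t - x}}
\quad\Longleftrightarrow\quad
g(t) = -\frac{1}{\pi}\int_t^{\infty} \frac{df}{dx}\,\frac{dx}{\sqrt{x - t}} \qquad (A2)
$$

to solve for the function $g(r^2)$ that multiplies the kernel $dr^2/\sqrt{r^2 - R^2}$ in Equation A1,
in terms of the observable combination $f(R^2) = \Sigma_*\sigma^2_{\rm los}(R^2)$:

$$
g(r^2) = (1-\beta)\,n_*\sigma_r^2 + \int_{r^2}^{\infty} \frac{\beta\,n_*\sigma_r^2}{2\,\tilde r^2}\,d\tilde r^2, \qquad (A3)
$$

$$
g(r^2) = -\frac{1}{\pi}\int_{r^2}^{\infty} \frac{d(\Sigma_*\sigma^2_{\rm los})}{dR^2}\,\frac{dR^2}{\sqrt{R^2 - r^2}}. \qquad (A4)
$$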
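
The Abel inversion pair can be checked numerically on a case with a known answer. The sketch below is not part of the derivation: it uses a hypothetical test pair, $g(t) = e^{-t}$ and $f(x) = \sqrt{\pi}\,e^{-x}$, a simple midpoint rule, and a truncation of the infinite upper limit, all of which are assumptions made for illustration.

```python
import math


def abel_inverse(df_dx, t, u_max=8.0, n=4000):
    """Evaluate g(t) = -(1/pi) * int_t^inf f'(x) dx / sqrt(x - t).

    The substitution x = t + u**2 removes the integrable singularity at x = t:
        g(t) = -(2/pi) * int_0^inf f'(t + u**2) du.
    The integral is approximated with the midpoint rule on [0, u_max], so f'
    must decay quickly enough for the truncation at u_max to be harmless.
    """
    h = u_max / n
    total = sum(df_dx(t + ((i + 0.5) * h) ** 2) for i in range(n))
    return -(2.0 / math.pi) * total * h


def df_dx_test(x):
    """Derivative of the test function f(x) = sqrt(pi) * exp(-x)."""
    return -math.sqrt(math.pi) * math.exp(-x)


if __name__ == "__main__":
    for t in (0.0, 0.5, 2.0):
        recovered = abel_inverse(df_dx_test, t)
        print(f"t = {t:3.1f}  recovered g = {recovered:.8f}  exact g = {math.exp(-t):.8f}")
```

With real data the derivative of the observed profile is noisy, so a practical application would need smoothing or a parametric fit before inversion; that step is outside the scope of this sketch.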

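In order to isolate $n_*\sigma_r^2$ we equate Equations A3 and A4 and then differentiate the resulting expression with respect to
$\ln r$ (denoted by $'$)

$$
(1-\beta)\,(n_*\sigma_r^2)' - (\beta + \beta')\,n_*\sigma_r^2
= -\frac{2r^2}{\pi}\int_{r^2}^{\infty} \frac{d^2(\Sigma_*\sigma^2_{\rm los})}{(dR^2)^2}\,\frac{dR^2}{\sqrt{R^2 - r^2}} \qquad (A5)
$$

and employ the integrating factor

$$
h(r) = \exp\left(-\int_{\ln a}^{\ln r} \frac{\beta + \beta'}{1-\beta}\,d\ln\tilde r\right) \qquad (A6)
$$

with the constant $a$ chosen such that the value of the integrand goes to zero at the lower limit:

$$
\begin{aligned}
n_*\sigma_r^2(r;\beta)
 &= -\frac{1}{\pi\,h(r)}\int_{r^2}^{\infty} \frac{h(\tilde r)}{\beta(\tilde r) - 1}
    \left[\int_{\tilde r^2}^{\infty} \frac{d^2(\Sigma_*\sigma^2_{\rm los})}{(dR^2)^2}\,\frac{dR^2}{\sqrt{R^2 - \tilde r^2}}\right] d\tilde r^2 \\
 &= -\frac{1}{\pi\,h(r)}\int_{r^2}^{\infty} \frac{d^2(\Sigma_*\sigma^2_{\rm los})}{(dR^2)^2}
    \left[\int_{r^2}^{R^2} \frac{h(\tilde r)}{\beta(\tilde r) - 1}\,\frac{d\tilde r^2}{\sqrt{R^2 - \tilde r^2}}\right] dR^2. \qquad (A7)
\end{aligned}
$$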



If one wishes to adopt a parametric form for $\beta(r)$, $n_*\sigma_r^2$ can be determined using Equation A7 and then inserted into the
Jeans equation to find the cumulative mass profile. Note that nothing guarantees a physical mass profile (i.e., one in which the mass never
decreases); given a very large number of stellar velocities with very low measurement errors, one can restrict the anisotropy
such that a physical mass is derived.

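If $\beta(r)$ is assumed to be constant, then the inner integral of the right-hand side of Equation A7 can be written in terms
of the incomplete Beta function:

$$
B_x(p,q) = \int_0^x y^{p-1}(1-y)^{q-1}\,dy. \qquad (A8)
$$

By utilizing the substitution $u = 1 - r^2/R^2$, we find

$$
n_*\sigma_r^2(r;\beta) = -\frac{r^{\beta/(1-\beta)}}{\pi(\beta - 1)}
\int_{r^2}^{\infty} R^{\frac{1-2\beta}{1-\beta}}\;
B_u\!\left(\frac{1}{2},\,\frac{2-3\beta}{2(1-\beta)}\right)
\frac{d^2(\Sigma_*\sigma^2_{\rm los})}{(dR^2)^2}\,dR^2. \qquad (A9)
$$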
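
The reduction of the inner integral to an incomplete Beta function can be verified numerically for constant anisotropy. The sketch below assumes the integrating factor scales as $h(\tilde r) \propto \tilde r^{-\beta/(1-\beta)}$ for constant $\beta$ and drops the common constant factors; the function names, the midpoint-rule integrator, and the test values of $r$, $R$, and $\beta$ are illustrative assumptions.

```python
import math


def midpoint(func, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(func(lo + (i + 0.5) * h) for i in range(n))


def inner_integral(r, R, beta):
    """int_{r^2}^{R^2} s**(-b) / sqrt(R^2 - s) ds, with s = rtilde^2 and b = beta / (2 (1 - beta)).

    The substitution s = R^2 - w^2 removes the singularity at s = R^2.
    """
    b = beta / (2.0 * (1.0 - beta))
    w_max = math.sqrt(R * R - r * r)
    return 2.0 * midpoint(lambda w: (R * R - w * w) ** (-b), 0.0, w_max)


def incomplete_beta_half(x, q):
    """B_x(1/2, q) = int_0^x y**(-1/2) (1 - y)**(q - 1) dy, using y = v^2 to remove the singularity."""
    return 2.0 * midpoint(lambda v: (1.0 - v * v) ** (q - 1.0), 0.0, math.sqrt(x))


def beta_function_form(r, R, beta):
    """R**((1 - 2 beta)/(1 - beta)) * B_u(1/2, (2 - 3 beta)/(2 (1 - beta))) with u = 1 - r^2/R^2."""
    q = (2.0 - 3.0 * beta) / (2.0 * (1.0 - beta))
    u = 1.0 - (r * r) / (R * R)
    return R ** ((1.0 - 2.0 * beta) / (1.0 - beta)) * incomplete_beta_half(u, q)


if __name__ == "__main__":
    r, R = 1.0, 2.0  # hypothetical radii with r < R
    for beta in (-0.5, 0.0, 0.3):
        print(beta, inner_integral(r, R, beta), beta_function_form(r, R, beta))
```

The second Beta-function argument is positive only for $\beta < 2/3$, so the sketch restricts itself to anisotropies below that value.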



In the final stages of this work, we learned of an alternative derivation performed by Mamon & Boué (2010), who provide single
integral expressions for both constant anisotropy and special cases of $\beta(r)$.
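By solving the Jeans equation we can derive the mass by first taking a derivative of Equation A7 and then inserting the form
derived in Equation A9:

$$
M(r;\beta) = \frac{r}{G\pi(\beta - 1)\,n_*(r)}
\int_{r^2}^{\infty} K(r,R;\beta)\,\frac{d^2(\Sigma_*\sigma^2_{\rm los})}{(dR^2)^2}\,dR^2, \qquad (A10)
$$

where

$$
K(r,R;\beta) = \frac{\beta(3-2\beta)}{1-\beta}\;r^{\beta/(1-\beta)}\,R^{\frac{1-2\beta}{1-\beta}}\;
B_{1-r^2/R^2}\!\left(\frac{1}{2},\,\frac{2-3\beta}{2(1-\beta)}\right)
- \frac{2r^2}{\sqrt{R^2 - r^2}}. \qquad (A11)
$$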



With this relation, the mass of a dispersion-supported system no longer depends on the
unknown radial dispersion $\sigma_r(r)$ but instead on the second derivative of the observable combination $\Sigma_*\sigma^2_{\rm los}(R)$. Note that determining
the slope of the mass profile will require an additional derivative, and thus we will require extremely accurate observational
constraints on both the light profile and the line-of-sight velocity dispersion. We conjecture that the data will need to be so
precise that the assumption of spherical symmetry will no longer do the data proper justice, and thus new derivations must
be explored.



APPENDIX B: USEFUL CONVERSIONS FROM 2D TO 3D HALF-LIGHT RADII 

In this Appendix we present scaling relations to derive the 3D deprojected half-light radius $r_{1/2}$ from the observed 2D
projected half-light radius $R_e$ for several commonly used stellar distributions. For the King (1962) profile, $R_0 = r_{\rm core}$ and
$c_k = \log_{10}(r_{\rm lim}/r_{\rm core})$. A Sersic profile is defined as $I(R) = I(0)\,e^{-b_n (R/R_0)^{1/n}}$, where $b_n$ is chosen such that $R_0 = R_e$. Note
that although the exponential and Gaussian profiles are special cases of the n=1 and n=0.5 Sersic profiles, the $R_e/R_0$
relations are different due to the definitions of their scale radii: an exponential profile is defined as $I(R) = I(0)\,e^{-R/R_0}$ and a
Gaussian profile is defined as $I(R) = I(0)\,e^{-R^2/(2R_0^2)}$.

Profile            R_e/R_0    r_1/2/R_e    r_3/r_1/2

Exponential        1.678      1.329        1.15
Gaussian           1.178      1.307        1.13
King (c_k=0.70)    1.185      1.322        1.13
Plummer            1.000      1.305        0.94
Sersic (n=2)       1.000      1.342        1.16
Sersic (n=4)       1.000      1.349        1.17
Sersic (n=8)       1.000      1.352        1.18



We do not include the NFW profile due to the fact that the mass is divergent (thus, $r_{1/2} \to \infty$). We also do not include
the Einasto (1965) profile in this table because it does not well represent the baryonic tracer number density of most galaxies.

The Hernquist (1990) profile is sometimes used in place of a Sersic profile due to the ease of analytic manipulation. But we
caution, as was pointed out in the original paper, that the projected central surface brightness diverges logarithmically. This
can cause to be quite large in magnitude, thus affecting the solution to Equation [21] more profoundly than if the more 
well-behaved Sersic profile is used to model a tracer population. 

Returning to the relations presented in the above table, for a King profile

$$R_e/R_0 = 0.5439 + 0.1044\,c_k + 1.5618\,c_k^2 - 0.7559\,c_k^3 + 0.2572\,c_k^4 \qquad (B1)$$

to better than 2% accuracy for $0.30 < c_k < 3.00$, and to better than 1% accuracy for $0.40 < c_k < 3.00$. Also,

$$r_{1/2}/R_e = 1.3088 + 0.0159\,c_k + 0.0066\,c_k^2 - 0.0035\,c_k^3 + 0.0004\,c_k^4 \qquad (B2)$$

to better than 0.04% accuracy for $0.30 < c_k < 3.00$. Thus, the dominant error is in the relation between $R_e$ and $R_0$.

In regard to the family of Sersic profiles, as stated above, $R_0 = R_e$. To relate $r_{1/2}$ to $R_e$, we utilize the following fit, which
Lima Neto et al. (1999) state is valid to 0.25% accuracy after testing against the numerical integration of the family of Sersic
profiles corresponding to $0.10 < n^{-1} < 2.0$:

$$r_{1/2}/R_e = 1.3560 - 0.0293\,n^{-1} + 0.0023\,n^{-2} \qquad (B3)$$

Thus, $r_{1/2}/R_e \simeq 4/3$ is accurate to better than 2% for most surface brightness profiles used to describe observed stellar
systems. We also note that this result has been shown before for a wide range of Sersic profiles in Ciotti (1991) and for the
Plummer (1911) profile in Spitzer (1987).
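
The polynomial fits of this Appendix are easy to evaluate and to compare against the tabulated values. The snippet below is a minimal sketch; the function names are hypothetical, the polynomial coefficients are read as ascending powers of $c_k$ and of $n^{-1}$, and the quoted validity ranges are enforced only through the docstrings.

```python
def king_re_over_r0(c_k):
    """Equation B1: R_e / R_0 for a King profile, quoted as valid for 0.30 < c_k < 3.00."""
    return (0.5439 + 0.1044 * c_k + 1.5618 * c_k ** 2
            - 0.7559 * c_k ** 3 + 0.2572 * c_k ** 4)


def king_rhalf_over_re(c_k):
    """Equation B2: r_1/2 / R_e for a King profile, quoted as valid for 0.30 < c_k < 3.00."""
    return (1.3088 + 0.0159 * c_k + 0.0066 * c_k ** 2
            - 0.0035 * c_k ** 3 + 0.0004 * c_k ** 4)


def sersic_rhalf_over_re(n):
    """Equation B3: r_1/2 / R_e for a Sersic profile of index n (fit in powers of 1/n)."""
    inv_n = 1.0 / n
    return 1.3560 - 0.0293 * inv_n + 0.0023 * inv_n ** 2


if __name__ == "__main__":
    print("King c_k = 0.70: R_e/R_0   =", round(king_re_over_r0(0.70), 3))
    print("King c_k = 0.70: r_1/2/R_e =", round(king_rhalf_over_re(0.70), 3))
    for n in (2, 4, 8):
        print(f"Sersic n = {n}: r_1/2/R_e =", round(sersic_rhalf_over_re(n), 3))
```

Evaluating the fits at $c_k = 0.70$ and at n = 2, 4, and 8 gives a direct consistency check against the corresponding rows of the table.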
Figure C1. Left: The masses within the stellar extent for Milky Way dSphs. The vertical axis shows masses derived using individual
stellar kinematics with our full likelihood procedure (see text) and the horizontal axis shows the "Illingworth approximation", which is
routinely used in the literature as a mass estimate for dSphs. Right: Errors on these masses for Milky Way dSphs. The vertical axis shows
the 68% error width derived from our full likelihood analysis and the horizontal axis shows the error width calculated by straightforward
error propagation using Equation C1. Note that this approximation tends to underestimate masses by up to an order of magnitude (left)
and also underestimates the relative error on the mass significantly (right). In both plots the solid line indicates the one-to-one relation.



APPENDIX C: COMPARISON WITH OTHER MASS APPROXIMATION FORMULAE 
C1 Illingworth Formula



Due to the large amount of attention that dSphs have received since new discoveries (Willman et al. 2005a,b; Zucker et al.
2006a,b; Belokurov et al. 2006, 2007; Sakamoto & Hasegawa 2006; Irwin et al. 2007; Walsh et al. 2007) were found in the
public data releases of the SDSS (York et al. 2000), we will discuss an estimator that is often used to determine their masses.
Because many dSphs look like larger versions of low-concentration globular clusters, the Illingworth formula (derived by
Illingworth (1976) for application only to globular clusters) is often used to estimate the masses of dSphs (e.g., Seitzer & Frogel



1985; Suntzeff et al. 1993; Hargreaves et al. 1994; Mateo 1998; Simon & Geha 2007). Two explicit assumptions made by
this formula are that the stellar velocity dispersion is isotropic and that the mass distribution follows a King (1966) light
distribution. Under these assumptions, the total mass within the stellar extent $r_{\rm lim}$ is stated as



$$M_{\rm Ill} = 167\,\mu\,r_{\rm core}\,\sigma_0^2\,G^{-1}, \qquad (C1)$$



where $\sigma_0$ is the central line-of-sight velocity dispersion of the system, $r_{\rm core}$ is the King core radius and $\mu$ is a parameter that
depends on the King concentration, $c_k = \log_{10}(r_{\rm lim}/r_{\rm core})$. It is common in the literature to set $\mu = 8$ (incorrectly) for all
dSphs based on a rough estimate provided in Table 4 of Mateo (1998). By adopting a value of $\mu$ without any error, many
published mass uncertainties for dSphs do not properly include light profile uncertainties, which are typically only factored in
from the error on $r_{\rm core}$. More important, however, is the implicit assumption that mass follows light in this formulation. While
this is a reasonable assumption for globular cluster systems, the majority of the mass in dwarf galaxies does not necessarily
follow the shape of their baryonic tracers (e.g., Sofue & Rubin 2001; Walker et al. 2007; Peñarrubia et al. 2008), as they are
likely to be deeply embedded inside of dark matter halos (e.g., White & Rees 1978).

The left panel of Figure C1 compares the masses $M(r_{\rm lim})$ of Milky Way dSphs derived using our general approach
to Equation C1. Symbol types correspond to luminosity bins, as indicated. For the general mass likelihoods, we analyze the
kinematics of individual stars (Muñoz et al. 2005, 2006; Koch et al. 2007; Martin et al. 2007; Simon & Geha 2007; Mateo et al.
2008; Walker et al. 2009a; Geha et al. 2009; Willman et al., in preparation), in conjunction with the distances and stellar
surface density profile parameters listed in Table 1. For the Illingworth approximation, we use the same observational datasets
to calculate $\sigma_0$ (which is very close to the luminosity-weighted dispersion since the dispersion profiles for the MW dSphs are
nearly constant with radius) and we follow the common practice of setting $\mu = 8$. Clearly, $M_{\rm Ill}$ systematically underestimates



We only accept stars whose projected distances lie within the lower limit of $r_{\rm lim}$ (see Table 1). For kinematics with assigned
membership probabilities, we only accept those with $p > 0.9$.
Figure C2. Left: Ratio of the luminosity-weighted square of the dispersion within projected radius $R$ divided by the luminosity-weighted
square of the dispersion integrated to infinity for two different dispersion models. The dashed lines (model 1) are derived by considering
the median dispersion profile model presented in Equation 1 of Cappellari et al. (2006): $\sigma_{\rm los}(R) \propto R^{-0.066}$. The solid lines (model 2) use
the same relation to within only one effective radius. After $R = R_e$, the dispersion profile is assumed to be flat as a function of projected
radius. The three projected Sersic surface brightness profiles modeled are, from top to bottom, n=6 (blue), n=4 (red), and n=2 (black).
Right: Comparison of the V-band mass-to-light ratios derived using our general two-component spherical Jeans models (x-axis)
to those obtained under the assumption that mass follows light and $\beta = 0$ (y-axis). The solid line represents the one-to-one
relation. The four galaxies modeled, from top to bottom, are NGC 4478, NGC 731, NGC 185, and NGC 855.



the mass with this value of $\mu$. This systematic difference follows from the fact that $M_{\rm Ill}$ forces the mass profile to truncate at
$r_{\rm lim}$ while the data prefer models where the mass distribution continues beyond the stellar extent.

However, the most dramatic difference between the full mass likelihoods and the Illingworth approximation is in the
implied uncertainty. Errors on the vertical axis represent the 68% width from the median of our derived mass likelihoods,
while the symbol placement is indicative of the median of the likelihood. The errors on the horizontal axis propagate the
observational errors on $r_{\rm core}$ and $\sigma_0$ using Equation C1. It is clear that using this equation underestimates the relative error
on the mass. As we have discussed, the uncertainty in the mass within $r_{\rm lim}$ is dominated by the velocity anisotropy, which
is not accounted for in the $M_{\rm Ill}$ equation, as it was derived under the assumption of isotropy. The right panel of Figure C1
shows a comparison between the $\log_{10}$ mass error in both cases.

In conclusion, the Illingworth approximation, which was derived to be applied only to globular clusters, is a very poor
estimate of the mass and mass uncertainty for dSph galaxies.



C2 Spitzer Formula 

In this subsection we slightly modify the mass estimator presented in Spitzer (1969), by halving their total mass, to better
compare to our mass estimator:

$$M_{1/2}^{\rm Spitzer} = 3.75\,G^{-1}\langle\sigma^2_{\rm los}\rangle\,r_{1/2}. \qquad (C2)$$

Despite the fact that this equation was derived by analyzing polytropes with indices between n=3 and n=5, which is a very
restrictive class of mass densities that describe multi-component galaxies, our coefficient in Equation 5 is only 20% under the
Spitzer coefficient of 3.75. Łokas & Mamon (2001) find coefficients in much better agreement with ours when analyzing a wide
variety of NFW halos, which better represent the mass density of real galaxies.



C3 Cappellari et al. Dynamical Mass-to-Light Ratio 

Using axisymmetric Jeans modeling, Cappellari et al. (2006) (hereafter C06) empirically find the following relation assuming
a single-component mass-follows-light (MfL) density distribution:

$$\Upsilon^{\rm C06} \simeq \frac{5.0\,\langle\sigma^2_{\rm los}\rangle_{R_e}\,R_e}{G\,L} = \frac{2.5\,\langle\sigma^2_{\rm los}\rangle_{R_e}\,R_e}{G\,L_{1/2}}, \qquad (C3)$$

where $\langle\sigma^2_{\rm los}\rangle_{R_e}$ is the luminosity-weighted square of the line-of-sight dispersion within $R_e$. In practice, C06 determined
$\langle\sigma^2_{\rm los}\rangle_{R_e}$ by extrapolating the measured luminosity-weighted square of the dispersion within the observational aperture $R_{\rm ap}$. The
physical aperture size varies depending on the system, but it typically corresponds to $R_{\rm ap} \simeq 0.7\,R_e$ for the data that C06
analyzed. We continue the convention presented within this paper, where $\Upsilon_{1/2}$ is the dynamical mass-to-light ratio within the
3D deprojected half-light radius $r_{1/2}$. Let us rewrite our Equation 2 in order to facilitate comparison with the C06 relation:

$$\Upsilon_{1/2} \simeq \frac{4\,\langle\sigma^2_{\rm los}\rangle\,R_e}{G\,L_{1/2}}, \qquad (C4)$$

where we remind the reader that $\langle\sigma^2_{\rm los}\rangle$ is the luminosity-weighted square of the line-of-sight dispersion over the entire galaxy.
In the limit that the observed velocity dispersion profile is perfectly flat at all radii we would expect $\langle\sigma^2_{\rm los}\rangle_{R_e} \simeq \langle\sigma^2_{\rm los}\rangle$. In this
case the C06 estimator (with a coefficient of 2.5) is smaller by ~40% compared to ours (with a coefficient of 4).

We explore three possible reasons for this difference in the coefficients. First, our analysis is explicitly spherical while
the C06 models are axisymmetric. C06 address this concern by comparing their axisymmetric model results to spherical
Jeans model results under the assumption that the velocity dispersion is isotropic ($\beta = 0$). In this comparison, they find little
difference in their dynamical mass-to-light ratios. While this is reassuring, it is not entirely general given the assumption of
$\beta = 0$ in their comparison. It remains to be seen if the geometric freedom becomes important in comparison to more general
spherical models with variable $\beta$. In principle, projection effects can add an additional ~20-30% uncertainty to spherical
mass estimates as discussed, for example, by Gavazzi (2005). However, it would be surprising if these effects were systematic
in biasing mass estimates.

A second possible reason for the difference in our coefficients is that $\langle\sigma^2_{\rm los}\rangle_{R_e} \neq \langle\sigma^2_{\rm los}\rangle$. Indeed, the median dispersion
profile studied by C06 falls with projected radius as $\sigma_{\rm los}(R) \propto R^{-0.066}$ over the range probed by their data such that we
would expect $\langle\sigma^2_{\rm los}\rangle_{R_e} > \langle\sigma^2_{\rm los}\rangle$. In the left panel of Figure C2 we plot the ratio of the luminosity-weighted square of the
velocity dispersion as measured within an aperture radius $R$ ($\langle\sigma^2_{\rm los}\rangle_R$) to the total luminosity-weighted square of the velocity
dispersion ($\langle\sigma^2_{\rm los}\rangle$) as a function of $R$ for two models of $\sigma_{\rm los}(R)$ velocity dispersion profiles and several light profiles (Sersic
profiles with n = 2, 4, and 6). Model 1 curves (dashed) assume the median C06 power law for $\sigma_{\rm los}(R)$ extends to all $R$. Model
2 curves (solid) assume the median C06 power law for $\sigma_{\rm los}(R)$ until $R = R_e$, and then a flat dispersion profile for larger
radii. This modification of the C06 model is motivated by the behavior of dispersion profiles of galaxies seen in high quality
kinematics that extend out to several effective radii (e.g., Proctor et al. 2009; Weijmans et al. 2009; Geha et al. 2010). As can
be seen in Figure C2, we expect $\langle\sigma^2_{\rm los}\rangle_{R_e}/\langle\sigma^2_{\rm los}\rangle \simeq 1.1$ for the typical aperture size ($R_{\rm ap} \simeq 0.7\,R_e$) in the data that C06
analyzed. This result allows us to approximate the C06 formula as $\Upsilon^{\rm C06}_{1/2} \simeq 2.8\,\langle\sigma^2_{\rm los}\rangle\,R_e/(G\,L_{1/2})$, bringing the ratio of the
C06 coefficient and our coefficient to within ~30%.
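
The arithmetic behind the quoted coefficient comparison can be laid out in a few lines. This is a sketch only: the helper names are hypothetical, and the dispersion ratio of 1.12 is an assumption chosen so that the corrected coefficient reproduces the value of 2.8 quoted in the text.

```python
OUR_COEFF = 4.0   # coefficient multiplying <sigma_los^2> R_e / (G L_1/2) in Equation C4
C06_COEFF = 2.5   # coefficient multiplying <sigma_los^2>_{R_e} R_e / (G L_1/2) in Equation C3


def fractional_deficit(coeff, reference=OUR_COEFF):
    """Fraction by which `coeff` falls below `reference`."""
    return 1.0 - coeff / reference


def aperture_corrected_coeff(coeff, dispersion_ratio):
    """Convert a coefficient on <sigma_los^2>_{R_e} into one on the total <sigma_los^2>.

    dispersion_ratio is <sigma_los^2>_{R_e} / <sigma_los^2> (assumed, not measured here).
    """
    return coeff * dispersion_ratio


if __name__ == "__main__":
    print("Flat-profile limit: C06 lower by", fractional_deficit(C06_COEFF))
    corrected = aperture_corrected_coeff(C06_COEFF, 1.12)  # assumed ratio
    print("Aperture-corrected C06 coefficient:", corrected)
    print("Remaining deficit:", fractional_deficit(corrected))
```

The flat-profile limit gives a deficit of 37.5%, which the text rounds to ~40%, and the aperture correction reduces it to 30%.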

A third difference between our method and that of C06 is that we have allowed the dark matter mass profile to be distinct
from the light profile, while C06 assume that mass follows light. In principle, the MfL assumption can impose a bias because
we expect the dynamical mass-to-light ratio to increase with radius. In the right panel of Figure C2 we explore this issue by
comparing the dynamical V-band mass-to-light ratios of four galaxies derived using our general methodology to those derived
under the assumption of MfL and $\beta = 0$. This MfL model mirrors that shown by C06 to reproduce their axisymmetric results.
The four galaxies modeled (from top to bottom: NGC 4478, NGC 731, NGC 185, and NGC 855) were chosen as they had the
lowest $\Upsilon^V_{1/2}$ values in Table 1. We see that two of the galaxies have median MfL mass-to-light ratios that are lower by ~35%
than those derived for the general spherical case. The other two galaxies do not show large differences. Thus, it is possible
that the MfL assumption can give rise to biases as large as 30%, even in systems that are not dark matter dominated.

Future investigations that allow for non-spherical, multicomponent mass models will be important for investigating the
advantages and limitations of the current set of assumptions that are often used in Jeans analyses.



The reason for the abscissa errors being larger than the ordinate errors is related to the additional freedom that we allow in our
modeling, particularly with regard to $\beta$, as we discuss in Section 3.

We note that the values derived from an anisotropic MfL model agree well with those in Table 1, as expected from Equation 2.
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